Abstract
contexts) but their continuation-passing transforms may not be. We also show that two
terms may be congruent in all untyped contexts but fail to be congruent in a language with 
call/cc operators, and that two terms may have the same meaning in a direct semantics 
but not in a continuation semantics. Hence, familiar reasoning about terms may be unsound 
in a setting with continuations, demonstrating the need for a theory of continuations. 
Chapter 1


Introduction 


Continuations control program flow using purely functional means. Informally, a contin- 
uation is a function representing the rest of the program: when passed an intermediate 
result (a value in a functional language, a store in an imperative language), the function 
“continues” the computation to the final result. In LISP programs, for example, the control 
stack can be thought of as representing the continuation of a program: the stack tells the 
interpreter how to continue the computation to the final answer. At a lower level, a program 
counter also represents a continuation, although the “function” may not be very clear. 
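As a minimal illustration of "a function representing the rest of the program," the sketch below evaluates (1 + 2) * 3 twice: once directly, and once by handing the value of the subterm (1 + 2) to an explicit continuation. The names `rest_of_program` and `add_cps` are hypothetical and chosen only for this example.

```python
def rest_of_program(v):
    # "The rest of the program" once the subterm (1 + 2) has produced the value v.
    return v * 3

def add_cps(a, b, k):
    # Continuation-passing addition: hand the sum to k instead of returning it.
    return k(a + b)

direct = (1 + 2) * 3
via_continuation = add_cps(1, 2, rest_of_program)
```

Both computations produce the same final answer; the second merely makes the continuation of the subterm visible as an ordinary function.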

The explicit use of continuations pervades the theory and practice of programming 
languages. Continuations first appeared in continuation-style semantics for imperative lan- 
guages [11, 30, 31]. In this style, continuations are explicitly passed to the meanings of all 
program statements. The meaning of imperative statements can be modeled as functions 
that change the continuation. For example, in an ALGOL-like language with goto <label> 
statements, each label marks a particular continuation. The meaning of the statement 
goto <label> is one that, upon receiving a store and a continuation, discards that contin- 
uation and passes the store to the continuation associated with <label>. Highly imperative 
constructs like goto are difficult or impossible to represent in “direct” semantics in which 
statements are modeled as functions from answers to answers [11, 30]. 
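The treatment of goto described above can be sketched in a few lines. This is only an illustration under assumed encodings, not the semantics of [11, 30, 31]: a store is a Python dictionary, a continuation is a function from stores to final answers, and the label environment `labels` is a hypothetical dictionary mapping each label to its continuation.

```python
def assign(var, fn):
    # Meaning of "var := fn(store)": update the store, then continue normally.
    def meaning(store, k, labels):
        new_store = dict(store)
        new_store[var] = fn(store)
        return k(new_store)
    return meaning

def goto(label):
    # Meaning of "goto label": discard the current continuation k and pass
    # the store to the continuation associated with the label.
    def meaning(store, k, labels):
        return labels[label](store)
    return meaning

def seq(first, second):
    # Meaning of "first; second": run first with a continuation that runs second.
    def meaning(store, k, labels):
        return first(store, lambda s: second(s, k, labels), labels)
    return meaning

# Program:  x := 1; goto done; x := 99; done: (halt)
halt = lambda store: store
program = seq(assign("x", lambda s: 1),
              seq(goto("done"), assign("x", lambda s: 99)))
final_store = program({}, halt, {"done": halt})
```

Because goto never calls the continuation it receives, the assignment x := 99 is skipped and the final store binds x to 1.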

Continuations appear in at least two other settings. In languages such as LISP and 
Scheme, the continuation of a program may be accessed through the control operator 
call-with-current-continuation (call/cc) [23]. The programmer may then use the 
continuation to repeat certain calculations, perform error traps, backtrack through a com- 
putation, or simulate forks and joins [10]. Continuations have also been used in compilers 
for languages such as Scheme and ML. These compilers apply a continuation-passing style 
(cps) transform as a fundamental step in compilation [1, 9, 28]. 

Each of the three settings involves “programming” with continuations, and it is almost 
self-evident that this requires a different style of thinking. What is not obvious, however, 
is whether working in a continuation setting requires new reasoning tools. Indeed, certain 
principles should remain valid in the context of continuations. For example, the substitution 
of actual parameters for formal parameters in procedure calls should not become invalid— 
otherwise, the addition of continuations would change the programming language in drastic 
ways! 

On the other hand, the mere addition of continuation-based control operators to lan- 
guages suggests that continuations change programming in a fundamental way. In the 
presence of control operators, a programmer may be able to distinguish pieces of code that 
were indistinguishable without control operators, making the language more powerful. One 
can make similar arguments for the other two settings. For instance, programs not 
expressible when programming directly in the language become expressible when using 
cps-converted code. 

This thesis attempts to make precise the intuition that continuations “change things” 
in the three settings of continuations. Using specific counterexamples, we shall prove that 
certain familiar reasoning principles are unsound in the three settings of continuations. In 
essence, reasoning about code in the usual way may lead one to draw faulty conclusions 
about the behavior of that code. By understanding the failure of reasoning principles in 
each of the three settings of continuations, we move closer to understanding continuations 
themselves; insights generated by the examples will help in building a suitable theory of 
continuations. 


1.1 Reasoning about Code 


By “reasoning principles” we mean principles for proving equivalences of code. Such 
principles capture the notion of “behavior of code.” For example, a λ-abstraction applied to an 
integer argument in LISP behaves the same (ignoring efficiency issues) as the body of the 
abstraction with the integer in place of the abstracted variable. These two pieces of code are 
equivalent, and the definition of a LISP interpreter may be used to verify this equivalence. 

Two pieces of code are “equivalent” if they produce the same “outcomes” under the 
interpreter. To make this more precise, we must define the observations, the net outcomes 
of the interpreter considered important. Typically, we choose to observe terms at which the 
interpreter stops. In the language $\lambda_v$, defined in Chapter 2, we will observe evaluation to 
numerals.¹ Let Eval(M) be a partial function from terms to terms, representing the output 
of the interpreter on terms; we then say 


Definition 1.1 (Informal) Two terms M and N are observationally equivalent if 
Eval(M) and Eval(N) agree on all observations. 


Two programs are observationally equivalent if they produce the same observable results. 

Observational equivalence states that two terms as given cannot be told apart by the 
interpreter. For languages with functional terms, observational equivalence is too coarse; 
one may still be able to distinguish two observationally equivalent terms. For instance, if we 
choose to observe “termination of the interpreter” in LISP, any two λ-abstractions would 
agree on all observations and hence would be considered observationally equivalent. Yet 
a programmer may be able to distinguish two λ-abstractions by writing a context (a term 
with a hole) that makes the terms evaluate to different observations. One may formalize 
this ability to distinguish terms: 


Definition 1.2 (Informal) Two terms M and N are observationally distinguishable 
iff for some context $C[\cdot]$, $C[M]$ and $C[N]$ differ on some observation (in other words, are 
not observationally equivalent). 


The complementary notion is, in fact, more important: 


Definition 1.3 (Informal) Two terms M and N are observationally congruent 
(written $M \equiv_{obs} N$) iff they are not observationally distinguishable. 


¹More complex observations may result in finer distinctions between terms; see [4, 17] for an example of 
another reasonable notion of observation. 


Observational congruence is the congruence closure of observational equivalence. 

From a software engineering perspective, observational congruence captures the notion 
of “modularity” of code. For example, two routines that “sort” should be observationally 
congruent: the “sort” routines should be interchangeable in any program, and the program 
should produce the same answers using either routine. Observational congruence also pro- 
vides one definition of a “correct” compiler optimization: if one piece of code is replaced by 
a faster yet observationally congruent piece, the optimization is “safe,” i.e., the optimized 
code will still produce the expected answer. 

When we say “reasoning about code,” we mean reasoning used to prove observational 
congruences. In fact, almost any reasoning principle may be viewed as a way to verify 
observational congruences. For instance, fixpoint induction in denotational semantics and 
pure λ-calculus-like equational reasoning are reasoning tools for proving congruences. These 
formal reasoning principles help justify the informal observational congruence reasoning 
used by programmers, clarifying common assumptions about the behavior of code. 


1.2 Outline of Thesis 


We concentrate on the setting of cps conversion, since the cps transform seems fundamental 
to understanding the other two settings of continuations. A continuation transform forms 
the basis of many continuation semantics (cf. [24, 26, 30]) and is often used to describe 
the semantics of call/cc-like operators (cf. [7, 8]). Chapter 2 describes a call-by-value 
functional language $\lambda_v$ and its continuation transform, both of which are the focus of study. 
In Chapter 3, we describe specific examples that show the failure of reasoning 
principles based on observational congruence. These examples will have the form “M and N are 
observationally congruent but not congruent in one of the continuation settings.” In 
particular, we show that two terms may be observationally congruent but their cps-transforms 
may not be. Similar observations are also made for the other two settings of continuations. 
The unsoundness of familiar reasoning principles indicates that a theory of continuations 
remains to be found. Chapter 4 discusses possible directions for such a theory. One method 
(currently being pursued) involves extending the retraction-based method of Meyer and 
Wand [15]. One might also seek results tying the three settings of continuations together. 
Finally, an Appendix is included which contains proofs of “standard” theorems for $\lambda_v$. 


Chapter 2 
The Language and its Continuation Transform 


This chapter defines $\lambda_v$, a call-by-value version of the language PCF [20, 25], including an 
interpreter for $\lambda_v$. A call-by-value continuation transform for the language is then given, 
along with theorems that show the correctness of the transform. 


2.1 Syntax 


The familiar syntax of the simply-typed λ-calculus forms the basis of $\lambda_v$. Each term in $\lambda_v$ 
has a type of the form $o$ or $(\sigma \to \tau)$, where $o$ is the sole base type, the type of natural 
numbers, and $\sigma \to \tau$ is the type of functions from $\sigma$ to $\tau$.¹ The set of terms with their 
corresponding types is defined in Figure 2.1. In this definition and throughout the text, 
Greek letters (with the exception of κ, λ, and μ) denote types, uppercase Roman letters 
denote terms, the lowercase letters f, g, and h are μ-variables, and all other letters (e.g., 
κ, a, b, c) are λ-variables except when otherwise stated. 

\[
\begin{array}{ll}
x^{\sigma} : \sigma & \text{$\lambda$-variables, where } x \in \mathcal{L} \\
f^{\sigma} : \sigma & \text{$\mu$-variables, where } f \in \mathcal{M} \\
c_l : o & \text{numerals } (l \ge 0) \\
\mathit{succ}, \mathit{pred} : o \to o & \text{functional constants} \\
(\mathit{cond}\ B\ M\ N) : o & \text{conditionals, where } B, M, N : o \\
(M\ N) : \tau & \text{applications, where } M : \sigma \to \tau \text{ and } N : \sigma \\
(\lambda x^{\sigma}.M) : \sigma \to \tau & \text{$\lambda$-abstractions, where } M : \tau \\
(\mu f^{\sigma}.M) : \sigma & \text{recursive definitions, where } M : \sigma
\end{array}
\]

Figure 2.1: The syntax for $\lambda_v$; here, $\mathcal{L}$ and $\mathcal{M}$ are two disjoint, infinite sets of variables. Each 
variable in $\lambda_v$ is tagged with a type (cf. [20]), but types will often be dropped when the context is 
clear. 

The λ- and μ-variables occurring in a term may be bound or free [2]. If two terms M 
and N differ only in the names of bound variables, we consider them to be syntactically 
equivalent and write $M \equiv N$ [2]. A term is closed if it contains no free variables; otherwise, 
a term is open. 

Contexts are special terms containing holes. A context $C[\cdot]$ is derived from a term M 
by replacing all free occurrences of some variable in M, say $f^{\sigma}$, by a hole $[\cdot]$. $C[N]$ is the 
result of replacing every hole in $C[\cdot]$ with N, where $N : \sigma$ and the type of the hole is $\sigma$. 


¹As is customary, parentheses will frequently be dropped from types with the understanding that $\to$ 
associates to the right. For example, $o \to o \to o$ is short for $(o \to (o \to o))$. 


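\[
\begin{array}{ll}
(\lambda x.M)\ V \to_v M[x := V],\ V \text{ a value} & \mu f.M \to_v M[f := \mu f.M] \\
\mathit{succ}\ c_l \to_v c_{l+1} & \mathit{pred}\ c_0 \to_v c_0 \\
\mathit{cond}\ c_0\ M_0\ M_1 \to_v M_0 & \mathit{pred}\ c_{l+1} \to_v c_l \\
\mathit{cond}\ c_{l+1}\ M_0\ M_1 \to_v M_1 &
\end{array}
\]

\[
\frac{B \to_v B'}{\mathit{cond}\ B\ M_0\ M_1 \to_v \mathit{cond}\ B'\ M_0\ M_1}
\qquad
\frac{M \to_v M',\quad c \in \{\mathit{succ}, \mathit{pred}\}}{c\ M \to_v c\ M'}
\]

\[
\frac{M \to_v M'}{M\ N \to_v M'\ N}
\qquad
\frac{N \to_v N'}{(\lambda x.M)\ N \to_v (\lambda x.M)\ N'}
\]

Figure 2.2: Structured rewrite rules for $\lambda_v$. Substitution of the term N for the variable x in M, 
with the necessary renaming of bound variables, is written $M[x := N]$ (see [2] for a formal definition). 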


2.2 Operational Semantics 


The relation $\to_v$, the one-step reduction relation on terms of $\lambda_v$, is defined in Figure 2.2 
using a structured operational semantics [19, 21]. In reducing applications, operands are 
substituted in for λ-bound variables only when the operand is a value. A value (usually 
denoted by V) is a λ-abstraction, a constant, or a λ-variable. None of these terms can be 
rewritten using $\to_v$, so a value is a term in evaluated form.² 

It is relatively easy to see from the fact that values are stopped that $\to_v$ is deterministic. 
This allows us to define an interpreter for $\lambda_v$ from $\to_v$. Since $\lambda_v$ is a language for arithmetic, 
we choose the final answers of the interpreter to be numerals. The input to an interpreter 
for $\lambda_v$ should therefore be closed terms of base type, which we call complete programs. 
(A complete program is a program coupled with a particular set of inputs.) The reflexive, 
transitive closure of the relation $\to_v$, written $\twoheadrightarrow_v$, can be used to define a partial recursive function 

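\[
\mathit{Eval}_v : \text{Complete programs} \to \text{Numerals}
\]

\[
\mathit{Eval}_v(M) =
\begin{cases}
c_l & \text{if } M \twoheadrightarrow_v c_l \\
\text{undefined} & \text{otherwise}
\end{cases}
\]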


which is an interpreter for the language. 
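The rewrite rules and the interpreter can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the thesis's own definition: terms are encoded as tuples, λ- and μ-variable names are assumed distinct, substitution is naive (which suffices for complete programs, where only closed terms are ever substituted), and the `fuel` bound is a hypothetical stand-in for "undefined" on non-terminating programs.

```python
# Assumed term encoding:
#   ("var", x)  ("muvar", f)  ("num", n)  ("const", "succ" | "pred")
#   ("cond", B, M0, M1)  ("app", M, N)  ("lam", x, M)  ("mu", f, M)

def is_value(t):
    # Values: lambda-abstractions, constants (numerals, succ, pred), lambda-variables.
    return t[0] in ("lam", "num", "const", "var")

def subst(t, name, s):
    # Naive substitution t[name := s]; no renaming is needed when s is closed.
    tag = t[0]
    if tag in ("var", "muvar"):
        return s if t[1] == name else t
    if tag in ("num", "const"):
        return t
    if tag in ("lam", "mu"):
        return t if t[1] == name else (tag, t[1], subst(t[2], name, s))
    return (tag,) + tuple(subst(u, name, s) for u in t[1:])

def step(t):
    # One step of the reduction relation, or None if no rule applies.
    tag = t[0]
    if tag == "mu":
        return subst(t[2], t[1], t)
    if tag == "cond":
        _, b, m0, m1 = t
        if b[0] == "num":
            return m0 if b[1] == 0 else m1
        b2 = step(b)
        return None if b2 is None else ("cond", b2, m0, m1)
    if tag == "app":
        _, m, n = t
        if not is_value(m):
            m2 = step(m)
            return None if m2 is None else ("app", m2, n)
        if not is_value(n):
            n2 = step(n)
            return None if n2 is None else ("app", m, n2)
        if m[0] == "lam":
            return subst(m[2], m[1], n)
        if m[0] == "const" and n[0] == "num":
            return ("num", n[1] + 1 if m[1] == "succ" else max(n[1] - 1, 0))
    return None

def eval_v(t, fuel=10_000):
    # Iterate the step function; return the numeral reached, or None if the
    # term gets stuck or the step budget runs out.
    for _ in range(fuel):
        if t[0] == "num":
            return t[1]
        t = step(t)
        if t is None:
            return None
    return None

succ = ("const", "succ")
# (lambda x. succ (succ x)) c_3
prog = ("app", ("lam", "x", ("app", succ, ("app", succ, ("var", "x")))), ("num", 3))
loop = ("mu", "f", ("muvar", "f"))
```

Here `eval_v(prog)` reaches the numeral 5, while `eval_v(loop, fuel=50)` exhausts its budget, mirroring the case in which the interpreter is undefined.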
In our investigation of the cps transform we will be most interested in reasoning about 
the behavior of code under $\mathit{Eval}_v$. We say that 


Definition 2.1 M observationally approximates N, written $M \lesssim_v N$, if, for any 
context $C[\cdot]$ such that $C[M]$ and $C[N]$ are complete programs, $C[M] \twoheadrightarrow_v c_l$ implies $C[N] \twoheadrightarrow_v c_l$. 


Two terms M and N are observationally congruent, written $M \equiv^{obs}_v N$, if $M \lesssim_v N$ and 
$N \lesssim_v M$. 

Observational congruences can be difficult to prove using only the definition [12]. For 
example, consider the terms $N_1 \equiv \lambda x.(\lambda y.y)\ c_3$ and $N_2 \equiv \lambda x.c_3$. If $N_1$ is applied to an 
argument during the evaluation of a program, the “active” subterm at the next stage will 
be $(\lambda y.y)\ c_3$, which will reduce to $c_3$. If $N_2$ appeared as the subterm instead, $c_3$ will again 
be the result. The terms should thus be congruent. This argument, however, is difficult to 
formalize and is of little use in proving other observational congruences. 


²Using this rationale, μ-variables might also be considered values, if it were not for the fact that 
μ-variables may be replaced by terms that require further evaluation. For example, f gets replaced by a 
non-value in the reduction $\mu f.f \to_v f[f := \mu f.f]$. In contrast, λ-variables remain values when reduced and 
hence are considered values. This distinction explains the need for two disjoint sets of variables. Plotkin 
also uses two sets of variables in one version of his metalanguage [22]. 

Equational reasoning based on $\to_v$ can be used to prove $N_1 \equiv^{obs}_v N_2$. Define the relation 
$=_v$ by replacing all $\to_v$'s in the definition of $\to_v$ by $=_v$'s, adding the axioms reflexivity, 
symmetry, and transitivity, and condensing the operational rules with antecedents into the 
congruence rule 

\[
\frac{M =_v M'}{C[M] =_v C[M']}
\]

where $C[\cdot]$ is any context (not necessarily making $C[M]$ a complete program). The rules of 
$=_v$ are sound for proving observational congruences. 


Theorem 2.2 If $M =_v N$, then $M \equiv^{obs}_v N$. 


Proof: Delayed to the Appendix. ∎ 


$N_1 \equiv^{obs}_v N_2$ now follows from the fact that $N_1 =_v N_2$. 
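For concreteness, the equational proof can be written out in two steps. The derivation below is a sketch that uses only the application axiom and the congruence rule; the choice of context $C[\cdot] \equiv \lambda x.[\cdot]$ is the one that seems to be intended.

```latex
\begin{align*}
(\lambda y.y)\ c_3 &=_v y[y := c_3] \equiv c_3
  && \text{(application axiom; $c_3$ is a value)} \\
\lambda x.(\lambda y.y)\ c_3 &=_v \lambda x.c_3
  && \text{(congruence rule with $C[\cdot] \equiv \lambda x.[\cdot]$)}
\end{align*}
```

The second line is exactly $N_1 =_v N_2$, so Theorem 2.2 applies.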

The converse to Theorem 2.2 is false: there are terms that are observationally 
congruent but cannot be proven equivalent.³ The following theorem will be useful in verifying 
congruences: 

Theorem 2.3 Let M and N be closed terms of the same type. Then $M \lesssim_v N$ iff, for all 
vectors $\vec{V}$ of closed values, $M\ \vec{V} \twoheadrightarrow_v V_0$ implies $N\ \vec{V} \twoheadrightarrow_v V_0'$ and $V_0 \equiv V_0'$ if either is a 
numeral. 


Proof: Delayed to the Appendix. ∎ 


Theorem 2.3 states that applicative contexts determine observational congruence (cf. [3].) 


2.3 Continuation Transform 


2.3.1 Definition 


The continuation transform for $\lambda_v$ is based on a cps transform appropriate for call-by-value 
[9, 15, 19]. The transform of a term M, written $\overline{M}$, is another term of $\lambda_v$. Figure 2.3 defines 
the transform of a term by structural induction on the term. 

The behavior of the interpreter for A, provides clues to understanding the continuized 
version of a term. Basically, the flow of control is made explicit by the continuations of 
a cps-converted term. For example, since values are not evaluated, the cps transform of a 
value simply passes the value to a continuation (the rest of the program.) For applications 
as well, the continuations in the transform of an application mimic the flow of control in the 
interpreter: the continuation passed to the operator first evaluates the operand and passes 
control to the operand’s continuation, which, in turn, applies the operator to the operand. 

The explicit incorporation of continuations requires that the transform change the type 
of a term. A continuized term accepts a continuation as an argument (a function from some 
type to a final answer), and produces a final answer given that continuation. The type of 
final answers for $\lambda_v$ is $o$, so a term of type $o$ is transformed into a term of type $(o \to o) \to o$. 


³In fact, observational congruence is not axiomatizable [2, 32], so one cannot hope for an equational proof 
system that captures observational congruence. 


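\[
\begin{array}{rcl}
\overline{x^{\sigma}} &=& \lambda\kappa.\kappa\ x^{\sigma'} \\
\overline{f^{\sigma}} &=& \lambda\kappa.f^{(\sigma' \to o) \to o}\ \kappa \\
\overline{c_l} &=& \lambda\kappa.\kappa\ c_l \\
\overline{\mathit{succ}} &=& \lambda\kappa.\kappa\ (\lambda x^{o}.\lambda\kappa_1.\kappa_1\ (\mathit{succ}\ x)) \\
\overline{\mathit{pred}} &=& \lambda\kappa.\kappa\ (\lambda x^{o}.\lambda\kappa_1.\kappa_1\ (\mathit{pred}\ x)) \\
\overline{\mathit{cond}\ B\ M_0\ M_1} &=& \lambda\kappa.\overline{B}\ (\lambda m^{o}.\mathit{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)) \\
\overline{(M\ N)} &=& \lambda\kappa.\overline{M}\ (\lambda m^{(\sigma \to \tau)'}.\overline{N}\ (\lambda n^{\sigma'}.m\ n\ \kappa)) \\
&& \text{(where } M : \sigma \to \tau \text{ and } N : \sigma\text{)} \\
\overline{\lambda x^{\sigma}.M} &=& \lambda\kappa.\kappa\ (\lambda x^{\sigma'}.\overline{M}) \\
\overline{\mu f^{\sigma}.M} &=& \lambda\kappa.(\mu f^{(\sigma' \to o) \to o}.\overline{M})\ \kappa
\end{array}
\]

Figure 2.3: The continuation transform for $\lambda_v$. The types of continuations κ (which have the form 
$\sigma' \to o$) have been omitted for clarity. Note that variables change types when transformed. 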


The situation for higher-typed terms is more complicated. The continuation of a 
higher-order term needs to accept functions which, given a value and another continuation, produce 
final answers. The transform of a term of type $\sigma$ is thus a term of type $(\sigma' \to o) \to o$, where 
$\sigma'$ is defined recursively by (cf. [15]) 

\[
\begin{array}{rcl}
o' &=& o \\
(\sigma \to \tau)' &=& \sigma' \to (\tau' \to o) \to o.
\end{array}
\]
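The recursive definition of the primed type can be computed mechanically. In the sketch below, the string "o" stands for the base type and a tuple ("fun", s, t) is an assumed encoding of the function type from s to t; the helper `show` is a hypothetical pretty-printer that lets the arrow associate to the right.

```python
def prime(ty):
    # o' = o ;  (s -> t)' = s' -> (t' -> o) -> o
    if ty == "o":
        return "o"
    _, s, t = ty
    return ("fun", prime(s), ("fun", ("fun", prime(t), "o"), "o"))

def transformed_type(ty):
    # A term of type a is transformed into a term of type (a' -> o) -> o.
    return ("fun", ("fun", prime(ty), "o"), "o")

def show(ty):
    # Pretty-print a type, parenthesizing only function types in argument position.
    if ty == "o":
        return "o"
    _, s, t = ty
    left = show(s) if s == "o" else f"({show(s)})"
    return f"{left} -> {show(t)}"

base = show(transformed_type("o"))
arrow_prime = show(prime(("fun", "o", "o")))
```

For the base type this yields (o -> o) -> o, as stated in the text; for the type o -> o the primed type is o -> (o -> o) -> o.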


2.3.2 Fundamental Properties of the Transform 


By inspecting the definition of the transform, one may observe that every operand in a 
transformed term is a value and hence need not be evaluated. In other words, transformed 
terms may be evaluated tail-recursively. Tail-recursiveness can lead to increased efficiency. 
A traditional call-by-value interpreter (or code generated by compilers) uses a stack to 
remember the position of the subterm currently being evaluated. In transformed terms, all 
operands in applications are in evaluated form, so an interpreter designed specifically for 
transformed terms does not require a stack.⁴ 

A corollary to the fact that all operands are values is unambiguous reducibility: call- 
by-name and call-by-value reduction strategies coincide on transformed terms. Unambigu- 
ous reducibility allows one to use the transform to simulate call-by-value in a call-by-name 
interpreter, as is done in [19]. 

Of course, the transform must satisfy correctness properties as well. If one expects 
to use the transform as a first step in compilation, for example, transformed terms must 
not produce different answers than the original terms! The continuation transform for the 
language satisfies two properties that guarantee its correctness: provable equality (i.e., $=_v$) 
is preserved by the transform and complete programs produce the same output as their 
transformed versions [9, 19]. 


2.3.2.1 Preservation of equational reasoning 


We follow Plotkin’s proof in [19] to show that $M =_v N$ implies $\overline{M} =_v \overline{N}$. 


⁴One may regard the cps version of a term as incorporating an explicit representation of the interpreter’s 
control stack. 


Substitutions performed by $=_v$ pose problems to a direct proof. Suppose, for example, 
that $=_v$ performs the substitution $M[x := V]$. We want $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$. In point 
of fact, it is easy to show that $\overline{(\lambda x.M)\ V} =_v \overline{M}[x := \Psi(V)]$, where 


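Definition 2.4 If V is a value, then $\Psi(V)$ is defined by 
• $\Psi(x^{\sigma}) = x^{\sigma'}$; 
• $\Psi(c_l) = c_l$; 
• $\Psi(\mathit{succ}) = \lambda x.\lambda\kappa_1.\kappa_1\ (\mathit{succ}\ x)$; 
• $\Psi(\mathit{pred}) = \lambda x.\lambda\kappa_1.\kappa_1\ (\mathit{pred}\ x)$; 
• $\Psi(\lambda x^{\sigma}.M) = \lambda x^{\sigma'}.\overline{M}$. 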


(Essentially, $\Psi(V)$ is $\overline{V}$ without the leading continuation.) The following lemma allows us 
to complete the argument that $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$: 


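Lemma 2.5 If V is a value and x is a λ-variable, then $\overline{M}[x^{\sigma'} := \Psi(V)] \equiv \overline{M[x^{\sigma} := V]}$. 
Proof: By structural induction on M. For the base case, M must be a constant or variable: 
Case 1: $M \equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \lambda\kappa.\kappa\ \Psi(V) \equiv \overline{V} \equiv \overline{M[x := V]}$. 
Case 2: $M \equiv t$ for some variable $t \not\equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \overline{t} \equiv \overline{M[x := V]}$. 
Case 3: $M \equiv a$ for some constant a. Similar to Case 2. 
For the induction case, we also divide into cases depending on the form of M: 

Case 1: $M \equiv \mathit{cond}\ B\ M_0\ M_1$. Then 

\[
\begin{aligned}
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\overline{B}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)))[x := \Psi(V)] \\
&\equiv \lambda\kappa.\overline{B[x := V]}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_0[x := V]}\ \kappa)\ (\overline{M_1[x := V]}\ \kappa)) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[x := V]}.
\end{aligned}
\]

Case 2: $M \equiv (M_1\ M_2)$. Then 

\[
\begin{aligned}
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))[x := \Psi(V)] \\
&\equiv \lambda\kappa.\overline{M_1[x := V]}\ (\lambda m.\overline{M_2[x := V]}\ (\lambda n.m\ n\ \kappa)) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[x := V]}.
\end{aligned}
\]

Case 3: $M \equiv \lambda y.M'$. If $y \equiv x$, then $\overline{M}[x := \Psi(V)] \equiv \overline{M} \equiv \overline{M[x := V]}$. If $y \not\equiv x$, 

\[
\begin{aligned}
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.\kappa\ (\lambda y.\overline{M'}))[x := \Psi(V)] \\
&\equiv \lambda\kappa.\kappa\ (\lambda y.\overline{M'[x := V]}) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[x := V]}.
\end{aligned}
\]

Case 4: $M \equiv \mu f.M'$. We know that $x \not\equiv f$; so 

\[
\begin{aligned}
\overline{M}[x := \Psi(V)] &\equiv (\lambda\kappa.(\mu f.\overline{M'})\ \kappa)[x := \Psi(V)] \\
&\equiv \lambda\kappa.(\mu f.\overline{M'[x := V]})\ \kappa \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[x := V]}.
\end{aligned}
\]

We have exhausted all cases, hence the lemma holds. ∎ 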


The analog of Lemma 2.5 for recursive definitions works somewhat more easily: 


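Lemma 2.6 If f is a μ-variable, then $\overline{M}[f^{(\sigma' \to o) \to o} := \mu f^{(\sigma' \to o) \to o}.\overline{N}] \equiv \overline{M[f^{\sigma} := \mu f^{\sigma}.N]}$. 

Proof: By structural induction on M. In the base case, we divide into cases on the form 
of M: 
Case 1: $M \equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \lambda\kappa.(\mu f.\overline{N})\ \kappa \equiv \overline{\mu f.N} \equiv \overline{M[f := \mu f.N]}$. 
Case 2: $M \equiv t$ for some variable $t \not\equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{t} \equiv \overline{M[f := \mu f.N]}$. 
Case 3: $M \equiv a$ for some constant a. Similar to Case 2. 

For the induction case, there are four cases to consider: 
Case 1: $M \equiv \mathit{cond}\ B\ M_0\ M_1$. Then 

\[
\begin{aligned}
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\overline{B}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)))[f := \mu f.\overline{N}] \\
&\equiv \lambda\kappa.\overline{B[f := \mu f.N]}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_0[f := \mu f.N]}\ \kappa)\ (\overline{M_1[f := \mu f.N]}\ \kappa)) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[f := \mu f.N]}.
\end{aligned}
\]

Case 2: $M \equiv (M_1\ M_2)$. Thus, 

\[
\begin{aligned}
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))[f := \mu f.\overline{N}] \\
&\equiv \lambda\kappa.\overline{M_1[f := \mu f.N]}\ (\lambda m.\overline{M_2[f := \mu f.N]}\ (\lambda n.m\ n\ \kappa)) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[f := \mu f.N]}.
\end{aligned}
\]

Case 3: $M \equiv \lambda y.M'$. Note that $f \not\equiv y$; thus 

\[
\begin{aligned}
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.\kappa\ (\lambda y.\overline{M'}))[f := \mu f.\overline{N}] \\
&\equiv \lambda\kappa.\kappa\ (\lambda y.\overline{M'[f := \mu f.N]}) \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[f := \mu f.N]}.
\end{aligned}
\]

Case 4: $M \equiv \mu g.M'$. If $g \equiv f$, then $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{M} \equiv \overline{M[f := \mu f.N]}$. On the other 
hand, if $g \not\equiv f$, 

\[
\begin{aligned}
\overline{M}[f := \mu f.\overline{N}] &\equiv (\lambda\kappa.(\mu g.\overline{M'})\ \kappa)[f := \mu f.\overline{N}] \\
&\equiv \lambda\kappa.(\mu g.\overline{M'[f := \mu f.N]})\ \kappa \\
&\qquad \text{(by the induction hypothesis)} \\
&\equiv \overline{M[f := \mu f.N]}.
\end{aligned}
\]

This concludes the proof. ∎ 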


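Given these two lemmas, we may complete the proof of the theorem: 
Theorem 2.7 If $M =_v N$, then $\overline{M} =_v \overline{N}$. 
Proof: By induction on the length n of the proof that $M =_v N$. In the base case, the 
length of the proof is 1, so an axiom was used: 
Case 1: $(\lambda x.M)\ V =_v M[x := V]$, where V is a value. Recall that $\overline{V} \equiv \lambda\kappa.\kappa\ \Psi(V)$. 
Therefore, 

\[
\begin{aligned}
\overline{(\lambda x.M)\ V} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\ (\lambda x.\overline{M}))\ (\lambda m.\overline{V}\ (\lambda n.m\ n\ \kappa)) \\
&=_v \lambda\kappa.(\lambda m.\overline{V}\ (\lambda n.m\ n\ \kappa))\ (\lambda x.\overline{M}) \\
&=_v \lambda\kappa.\overline{V}\ (\lambda n.(\lambda x.\overline{M})\ n\ \kappa) \\
&=_v \lambda\kappa.(\lambda n.(\lambda x.\overline{M})\ n\ \kappa)\ \Psi(V) \\
&=_v \lambda\kappa.(\lambda x.\overline{M})\ \Psi(V)\ \kappa \\
&=_v \lambda\kappa.(\overline{M}[x := \Psi(V)])\ \kappa \\
&=_v \lambda\kappa.\overline{M[x := V]}\ \kappa
\end{aligned}
\]

where the last equation follows from Lemma 2.5. Examining the continuation 
transform, we note that every continuized term begins with a λ-abstraction; thus, 

\[
\lambda\kappa.\overline{M[x := V]}\ \kappa =_v \overline{M[x := V]}
\]

so $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$. 
Case 2: $\mathit{cond}\ c_0\ M_0\ M_1 =_v M_0$. By calculation, 

\[
\begin{aligned}
\overline{\mathit{cond}\ c_0\ M_0\ M_1} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\ c_0)\ (\lambda m.\mathit{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)) \\
&=_v \lambda\kappa.\mathit{cond}\ c_0\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa) \\
&=_v \lambda\kappa.\overline{M_0}\ \kappa =_v \overline{M_0}.
\end{aligned}
\]

Case 3: $\mathit{cond}\ c_{l+1}\ M_0\ M_1 =_v M_1$. Similar to the previous case. 

Case 4: $\mathit{succ}\ c_l =_v c_{l+1}$. By calculation, 

\[
\begin{aligned}
\overline{\mathit{succ}\ c_l} &=_v \lambda\kappa.(\lambda\kappa_1.\kappa_1\ (\lambda x.\lambda\kappa_2.\kappa_2\ (\mathit{succ}\ x)))\ (\lambda m.(\lambda\kappa_3.\kappa_3\ c_l)\ (\lambda n.m\ n\ \kappa)) \\
&=_v \lambda\kappa.(\lambda\kappa_3.\kappa_3\ c_l)\ (\lambda n.(\lambda x.\lambda\kappa_2.\kappa_2\ (\mathit{succ}\ x))\ n\ \kappa) \\
&=_v \lambda\kappa.(\lambda x.\lambda\kappa_2.\kappa_2\ (\mathit{succ}\ x))\ c_l\ \kappa \\
&=_v \lambda\kappa.\kappa\ (\mathit{succ}\ c_l) =_v \lambda\kappa.\kappa\ c_{l+1} =_v \overline{c_{l+1}}.
\end{aligned}
\]

Case 5: $\mathit{pred}\ c_0 =_v c_0$. Similar to the previous case. 
Case 6: $\mathit{pred}\ c_{l+1} =_v c_l$. Similar to the previous case. 
Case 7: $\mu f.M =_v M[f := \mu f.M]$. By calculation, 

\[
\begin{aligned}
\overline{\mu f.M} &=_v \lambda\kappa.(\mu f.\overline{M})\ \kappa \\
&=_v \lambda\kappa.(\overline{M}[f := \mu f.\overline{M}])\ \kappa \\
&=_v \lambda\kappa.(\overline{M[f := \mu f.M]})\ \kappa =_v \overline{M[f := \mu f.M]},
\end{aligned}
\]

the third equation following from Lemma 2.6. 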




Case 8: $M =_v M$. Trivial. 


In the induction case, the length of the proof is n + 1; again, we divide into cases, this 
time depending on the last rule used: 


Case 1: $M =_v N$ and $N =_v P$ implies $M =_v P$. By the induction hypothesis, we know 
that $\overline{M} =_v \overline{N}$ and $\overline{N} =_v \overline{P}$, so we can conclude that $\overline{M} =_v \overline{P}$ by the transitivity rule. 
Case 2: $M =_v N$ implies $N =_v M$. Trivial. 


Case 3: $M =_v N$ implies $C[M] =_v C[N]$. Using the induction hypothesis $\overline{M} =_v \overline{N}$, an 
easy structural induction on the context $C[\cdot]$ shows that $\overline{C[M]} =_v \overline{C[N]}$. 


This list exhausts the possibilities for last rule used, hence we are done. ∎ 


2.3.2.2 Adequacy 


Theorem 2.7 does not explain the correspondence of evaluation of terms and their 
cps-versions. For complete programs in particular, we expect the interpreter to give the same 
answers from both the direct and continuized versions, except that continuized versions 
must be passed a “default continuation,” viz., the identity function: 

\[
M \twoheadrightarrow_v c_l \quad \text{iff} \quad \overline{M}\ (\lambda x.x) \twoheadrightarrow_v c_l.
\]

Indeed, this fact must hold if we wish to use cps conversion in compilers.⁵ 
The proof proceeds using the method in [19]. The key observation is that certain 
reductions on transformed terms have no corresponding reduction on non-continuized versions. 
For example, consider the complete program $c_5$. The direct version cannot be reduced, but 
$\overline{c_5}\ (\lambda x.x)$ can be: 

\[
(\lambda\kappa.\kappa\ c_5)\ (\lambda x.x) \to_v (\lambda x.x)\ c_5 \to_v c_5.
\]

The first reduction is called an administrative reduction, since only a continuation is 
passed. The relation $\star$ applies a continuized term to a continuation and performs all possible 
administrative reductions: 


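Definition 2.8 For any term $M : \sigma$ and any value $K : \sigma' \to o$, we define $M \star K$ by 

\[
\begin{array}{rcl}
M \star K &=& K\ \Psi(M), \quad \text{if $M$ is a value} \\[4pt]
f \star K &=& f\ K, \quad \text{if $f$ is a $\mu$-variable} \\[4pt]
(\mathit{cond}\ B\ M_1\ M_2) \star K &=&
\begin{cases}
\mathit{cond}\ \Psi(B)\ (\overline{M_1}\ K)\ (\overline{M_2}\ K) & \text{if $B$ is a value} \\
B \star (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)) & \text{otherwise}
\end{cases} \\[4pt]
(M_1\ M_2) \star K &=&
\begin{cases}
M_1 \star (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) & \text{if $M_1$ is not a value} \\
M_2 \star (\lambda n.\Psi(M_1)\ n\ K) & \text{if $M_1$, but not $M_2$, is a value} \\
\Psi(M_1)\ \Psi(M_2)\ K & \text{otherwise}
\end{cases} \\[4pt]
\mu f.M' \star K &=& (\mu f.\overline{M'})\ K
\end{array}
\]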


The following lemma confirms that the definition actually represents a “partial reduction” 
of a continuized term: 


⁵Note that the ⇒ direction follows from Theorems 2.2 and 2.7, but the converse does not follow directly. 

Lemma 2.9 If K is a value, then $\overline{M}\ K \twoheadrightarrow_v M \star K$. 


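Proof: By structural induction on M. For the base case, divide into cases depending on 
M: 
Case 1: $M \equiv x$. Then $\overline{M}\ K \equiv (\lambda\kappa.\kappa\ x)\ K \to_v K\ x \equiv M \star K$. 
Case 2: $M \equiv f$. Then $\overline{M}\ K \equiv (\lambda\kappa.f\ \kappa)\ K \to_v f\ K \equiv M \star K$. 
Case 3: $M \equiv c_l$. Then $\overline{M}\ K \equiv (\lambda\kappa.\kappa\ c_l)\ K \to_v K\ c_l \equiv M \star K$. 
Case 4: $M \equiv \mathit{succ}$. Then $\overline{M}\ K \to_v K\ (\lambda x.\lambda\kappa.\kappa\ (\mathit{succ}\ x)) \equiv M \star K$. 
Case 5: $M \equiv \mathit{pred}$. Similar to the previous case. 
For the induction case, 
Case 1: $M \equiv \mathit{cond}\ B\ M_1\ M_2$. Then 

\[
\begin{aligned}
\overline{M}\ K &\equiv (\lambda\kappa.\overline{B}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ \kappa)\ (\overline{M_2}\ \kappa)))\ K \\
&\to_v \overline{B}\ (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)).
\end{aligned}
\]

If B is a value, then $\overline{M}\ K \twoheadrightarrow_v \mathit{cond}\ \Psi(B)\ (\overline{M_1}\ K)\ (\overline{M_2}\ K) \equiv M \star K$; otherwise, 

\[
\overline{M}\ K \twoheadrightarrow_v B \star (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)) \equiv M \star K
\]

(by the induction hypothesis). 

Case 2: $M \equiv (M_1\ M_2)$. If $M_1$ is not a value, 

\[
\begin{aligned}
\overline{M}\ K &\equiv (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))\ K \\
&\to_v \overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) \\
&\twoheadrightarrow_v M_1 \star (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) \equiv M \star K
\end{aligned}
\]

(by the induction hypothesis). If $M_1$ but not $M_2$ is a value, 

\[
\begin{aligned}
\overline{M}\ K &\twoheadrightarrow_v \overline{M_2}\ (\lambda n.\Psi(M_1)\ n\ K) \\
&\twoheadrightarrow_v M_2 \star (\lambda n.\Psi(M_1)\ n\ K) \equiv M \star K
\end{aligned}
\]

(by the induction hypothesis). Finally, if both $M_1$ and $M_2$ are values, 

\[
\overline{M}\ K \twoheadrightarrow_v \overline{M_2}\ (\lambda n.\Psi(M_1)\ n\ K) \twoheadrightarrow_v \Psi(M_1)\ \Psi(M_2)\ K \equiv M \star K.
\]

Case 3: $M \equiv \lambda x.M'$. Then 
$\overline{M}\ K \equiv (\lambda\kappa.\kappa\ (\lambda x.\overline{M'}))\ K \to_v K\ (\lambda x.\overline{M'}) \equiv M \star K$. 
Case 4: $M \equiv \mu f.M'$. Then 
$\overline{M}\ K \equiv (\lambda\kappa.(\mu f.\overline{M'})\ \kappa)\ K \to_v (\mu f.\overline{M'})\ K \equiv M \star K$. 
This concludes the proof of the lemma. ∎ 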


Once the administrative reductions on a continuized term have been performed, the 
next reductions correspond to reductions on the original version of the term: 
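Lemma 2.10 If $M \to_v N$ and K is a value, then $M \star K \twoheadrightarrow_v N \star K$. 


Proof: By induction on the length of proof of $M \to_v N$. In the base case, the length of 
the reduction is 1; we divide into cases depending on the operational rule used: 

Case 1: $(\lambda x.M')\ V \to_v M'[x := V]$. Then 

\[
\begin{aligned}
M \star K &\equiv \Psi(\lambda x.M')\ \Psi(V)\ K \\
&\to_v (\overline{M'}[x := \Psi(V)])\ K \\
&\twoheadrightarrow_v M'[x := V] \star K \qquad \text{(by Lemmas 2.5 and 2.9)} \\
&\equiv N \star K.
\end{aligned}
\]

Case 2: $\mathit{succ}\ c_l \to_v c_{l+1}$. Then 
$M \star K \twoheadrightarrow_v \Psi(\mathit{succ})\ \Psi(c_l)\ K \twoheadrightarrow_v K\ (\mathit{succ}\ c_l) \to_v K\ c_{l+1} \equiv N \star K$. 
Case 3: $\mathit{pred}\ c_0 \to_v c_0$. Similar to the previous case. 
Case 4: $\mathit{pred}\ c_{l+1} \to_v c_l$. Similar to the previous case. 
Case 5: $\mathit{cond}\ c_0\ M_1\ M_2 \to_v M_1$. Then 
$M \star K \equiv \mathit{cond}\ c_0\ (\overline{M_1}\ K)\ (\overline{M_2}\ K) \twoheadrightarrow_v M_1 \star K$ 
by Lemma 2.9. 
Case 6: $\mathit{cond}\ c_{l+1}\ M_1\ M_2 \to_v M_2$. Similar to the previous case. 
Case 7: $\mu f.M' \to_v M'[f := \mu f.M']$. Then 

\[
\begin{aligned}
M \star K &\equiv (\mu f.\overline{M'})\ K \\
&\to_v (\overline{M'}[f := \mu f.\overline{M'}])\ K \\
&\twoheadrightarrow_v M'[f := \mu f.M'] \star K
\end{aligned}
\]

by Lemmas 2.6 and 2.9. 

In the induction case we consider proofs of length greater than 1, and divide into cases 
depending on the last operational rule used: 

Case 1: $B \to_v B'$ implies $\mathit{cond}\ B\ M_1\ M_2 \to_v \mathit{cond}\ B'\ M_1\ M_2$. Note that B cannot be a 
value; hence if $B'$ is not a value, 

\[
\begin{aligned}
M \star K &\equiv B \star (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)) \\
&\twoheadrightarrow_v B' \star (\lambda m.\mathit{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)) \equiv N \star K
\end{aligned}
\]

by the induction hypothesis. If $B'$ is a value, then 
$M \star K \twoheadrightarrow_v \mathit{cond}\ \Psi(B')\ (\overline{M_1}\ K)\ (\overline{M_2}\ K) \equiv N \star K$. 

Case 2: $P \to_v P'$ implies $\mathit{succ}\ P \to_v \mathit{succ}\ P'$. P cannot be a value, so if $P'$ is not a 
value, 

\[
\begin{aligned}
M \star K &\equiv P \star (\lambda n.(\lambda x.\lambda\kappa.\kappa\ (\mathit{succ}\ x))\ n\ K) \\
&\twoheadrightarrow_v P' \star (\lambda n.(\lambda x.\lambda\kappa.\kappa\ (\mathit{succ}\ x))\ n\ K) \equiv N \star K
\end{aligned}
\]

by the induction hypothesis. If $P'$ is a value, then 
$M \star K \twoheadrightarrow_v (\lambda x.\lambda\kappa.\kappa\ (\mathit{succ}\ x))\ \Psi(P')\ K \equiv N \star K$. 
Case 3: $P \to_v P'$ implies $\mathit{pred}\ P \to_v \mathit{pred}\ P'$. Similar to the previous case. 
Case 4: $Q \to_v Q'$ implies $(\lambda x.P)\ Q \to_v (\lambda x.P)\ Q'$. Similar to the previous case. 
Case 5: $P \to_v P'$ implies $P\ Q \to_v P'\ Q$. P cannot be a value, so if $P'$ is not a value, 

\[
\begin{aligned}
M \star K &\equiv P \star (\lambda m.\overline{Q}\ (\lambda n.m\ n\ K)) \\
&\twoheadrightarrow_v P' \star (\lambda m.\overline{Q}\ (\lambda n.m\ n\ K)) \equiv N \star K
\end{aligned}
\]

by the induction hypothesis. If $P'$ is a value and Q is not, then 

\[
\begin{aligned}
M \star K &\twoheadrightarrow_v \overline{Q}\ (\lambda n.\Psi(P')\ n\ K) \\
&\twoheadrightarrow_v Q \star (\lambda n.\Psi(P')\ n\ K) \equiv N \star K
\end{aligned}
\]

by Lemma 2.9. If both $P'$ and Q are values, then 

\[
\begin{aligned}
M \star K &\twoheadrightarrow_v \overline{Q}\ (\lambda n.\Psi(P')\ n\ K) \\
&\twoheadrightarrow_v \Psi(P')\ \Psi(Q)\ K \equiv N \star K.
\end{aligned}
\]

As all operational rules have been considered, we are done. ∎ 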


These facts about administrative and non-administrative reductions on continuized 
terms give us the ability to prove the following theorem originally due to Fischer [9]: 


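Theorem 2.11 (Adequacy) If M is a complete program, then 
$\mathit{Eval}_v(M) = c_l$ iff $\mathit{Eval}_v(\overline{M}\ (\lambda x^{o}.x)) = c_l$. 


Proof: (⇒) Suppose $\mathit{Eval}_v(M) = c_l$; then we know that $M \twoheadrightarrow_v c_l$. By Lemmas 2.9 and 
2.10 we then have 

\[
\overline{M}\ (\lambda x.x) \twoheadrightarrow_v M \star (\lambda x.x) \twoheadrightarrow_v c_l \star (\lambda x.x) \twoheadrightarrow_v c_l.
\]

Thus, $\mathit{Eval}_v(\overline{M}\ (\lambda x.x)) = c_l$. 
(⇐) Suppose $\mathit{Eval}_v(M)$ is not defined. Then 

\[
M \to_v M_1 \to_v M_2 \to_v \cdots
\]

By Lemmas 2.9 and 2.10, we thus know 

\[
\overline{M}\ (\lambda x.x) \twoheadrightarrow_v M \star (\lambda x.x) \twoheadrightarrow_v M_1 \star (\lambda x.x) \twoheadrightarrow_v M_2 \star (\lambda x.x) \twoheadrightarrow_v \cdots
\]

so $\mathit{Eval}_v(\overline{M}\ (\lambda x^{o}.x))$ is not defined either. ∎ 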
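The statement of the Adequacy Theorem can be checked on a small instance. The sketch below is not the proof's method: it is a shallow embedding in which Python closures play the role of transformed terms, with hypothetical combinators `cps_num`, `cps_var`, `cps_app`, and `cps_lam` written to mirror the corresponding clauses of Figure 2.3. The program used is $(\lambda x.\mathit{succ}\ (\mathit{succ}\ x))\ c_3$.

```python
def cps_num(n):
    # Transform of a numeral: pass it to the continuation.
    return lambda k: k(n)

def cps_var(x):
    # Transform of a lambda-variable (already bound to a Python value).
    return lambda k: k(x)

# Transform of succ: pass the continuation a function expecting an argument
# and a further continuation.
cps_succ = lambda k: k(lambda x: lambda k1: k1(x + 1))

def cps_app(m_bar, n_bar):
    # Transform of (M N): evaluate the operator, then the operand, then apply.
    return lambda k: m_bar(lambda m: n_bar(lambda n: m(n)(k)))

def cps_lam(body):
    # Transform of an abstraction; body maps an argument to a transformed term.
    return lambda k: k(lambda x: body(x))

# Direct version of (lambda x. succ (succ x)) c_3.
direct = (lambda x: (x + 1) + 1)(3)

# Continuized version, applied below to the identity continuation.
body = lambda x: cps_app(cps_succ, cps_app(cps_succ, cps_var(x)))
prog_bar = cps_app(cps_lam(body), cps_num(3))
continuized = prog_bar(lambda x: x)
```

Both `direct` and `continuized` evaluate to 5, as the theorem predicts for this program when the default continuation is the identity function.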
Chapter 3 


Continuations May Be Unreasonable 


The Adequacy Theorem establishes a strong connection between the evaluation of terms 
and their continuized versions. The theorem easily extends to reasoning about complete 
programs, viz., proving observational congruences. It follows that for complete programs 
M and N, 


\[
M \equiv^{obs}_v N \quad \text{iff} \quad \overline{M}\ (\lambda x.x) \equiv^{obs}_v \overline{N}\ (\lambda x.x).
\]


The connection between direct and continuized versions of higher-order terms is less obvious, 
but one may still see a partial relationship between reasoning on direct versus reasoning on 
continuized terms: 


Corollary 3.1 If $\overline{M} \equiv^{obs}_v \overline{N}$, then $M \equiv^{obs}_v N$. 


Proof: Suppose M and N were distinguishable by some context $C[\cdot]$. Then by the 
Adequacy Theorem, the context $\overline{C[\cdot]}\ (\lambda x.x)$ would distinguish $\overline{M}$ and $\overline{N}$, a contradiction. ∎ 


In particular, if one can distinguish two terms by a context, the transforms of those terms 
will also be distinguishable. 

The problem with the continuation transform is that the converse of Corollary 3.1 does
not hold: observational congruence on direct terms does not coincide with congruence on
continuized terms. Similar anomalies occur in the other two settings. For example, suppose
we augment $\lambda_v$ with the call/cc-like operators $\mathcal{C}$ and $\mathcal{A}$ defined in [7, 8]. Terms that are
observationally congruent in $\lambda_v$ may become distinguishable using contexts containing these
new operators. In the case of continuation semantics, there are observationally congruent
terms that are equivalent in a direct semantics but not equivalent in a continuation semantics.
Reasoning principles based on observational congruence may thus become unsound in
settings involving continuations.

In the continuation transform setting, the anomaly is manifested at terms of higher
type. In particular, two higher-order closed terms may be observationally congruent but
their transforms may not be:


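Theorem 3.2 There exist two closed, pure (i.e., containing no constants, conditionals, or
recursion) terms, namely

\[
\begin{aligned}
M_1 &= \lambda x.\lambda y.\lambda z.(\lambda w.x\,z\,w)\,(y\,z) \\
M_2 &= \lambda x.\lambda y.\lambda z.x\,z\,(y\,z),
\end{aligned}
\]

with $M_1 \equiv^{v}_{obs} M_2$ but $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$.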
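Proof: To show $M_1 \equiv^{v}_{obs} M_2$, we proceed in a purely operational fashion using Theorem 2.3.¹
We first show that $M_1 \sqsubseteq_v M_2$. Pick any values $V_1$, $V_2$, and $V_3$; then $M_2 \twoheadrightarrow_v V'$,
$M_2\,V_1 \twoheadrightarrow_v V''$, and $M_2\,V_1\,V_2 \twoheadrightarrow_v V'''$, so all vectors $\vec{V}$ of length 0, 1, or 2 make the statement

\[
M_1\,\vec{V} \twoheadrightarrow_v V_0 \quad\text{implies}\quad M_2\,\vec{V} \twoheadrightarrow_v V_0'
\]

hold. Now suppose $M_1\,V_1\,V_2\,V_3 \twoheadrightarrow_v c_l$. Then

\[
M_1\,V_1\,V_2\,V_3 \twoheadrightarrow_v (\lambda w.V_1\,V_3\,w)\,(V_2\,V_3) \twoheadrightarrow_v c_l
\]

so it must be the case that $V_2\,V_3 \twoheadrightarrow_v V'$ and $V_1\,V_3 \twoheadrightarrow_v V''$ for some values $V'$ and $V''$.
Therefore,

\[
\begin{aligned}
M_2\,V_1\,V_2\,V_3 &\twoheadrightarrow_v V_1\,V_3\,(V_2\,V_3) \\
                   &\twoheadrightarrow_v V''\,V' \\
                   &\twoheadrightarrow_v c_l.
\end{aligned}
\]

Thus, by Theorem 2.3, $M_1 \sqsubseteq_v M_2$. Using a similar argument, one can show $M_2 \sqsubseteq_v M_1$.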
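To show that $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$, we first reduce $\overline{M_1}$ and $\overline{M_2}$ using $=_v$:

\[
\begin{aligned}
\overline{M_1} &=_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.y\,z\,(\lambda n.x\,z\,(\lambda m.m\,n\,\kappa_3))))) \\
\overline{M_2} &=_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3)))))
\end{aligned}
\]

(where the types have been omitted for clarity). Intuitively, the difference between $\overline{M_1}$ and
$\overline{M_2}$ comes from a difference in the way $M_1$ and $M_2$ are reduced when applied to arguments:
$M_1$ evaluates $(y\,z)$ first, while $M_2$ evaluates $(x\,z)$ first. The typable context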


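\[
\begin{aligned}
C[\cdot] &= [\cdot]\,N_0, \text{ where} \\
N_0 &= \lambda p.p\,(\lambda a.\lambda b.c_1)\,N_1, \text{ where} \\
N_1 &= \lambda q.q\,(\lambda a.\lambda b.c_2)\,N_2, \text{ where} \\
N_2 &= \lambda r.r\,c_1\,(\lambda a.a)
\end{aligned}
\]

distinguishes $\overline{M_1}$ and $\overline{M_2}$, since $C[\overline{M_1}]$ terminates with result $c_2$ and $C[\overline{M_2}]$ terminates with
result $c_1$:

\[
\begin{aligned}
C[\overline{M_1}] &=_v (\lambda a.\lambda b.c_2)\,c_1\,(\lambda n.(\lambda a.\lambda b.c_1)\,c_1\,(\lambda m.m\,n\,(\lambda a.a))) \\
                  &=_v c_2 \\
C[\overline{M_2}] &=_v (\lambda a.\lambda b.c_1)\,c_1\,(\lambda m.(\lambda a.\lambda b.c_2)\,c_1\,(\lambda n.m\,n\,(\lambda a.a))) \\
                  &=_v c_1.
\end{aligned}
\]

Thus $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$.² ∎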


Using a marked language (cf. Appendix), one can show that the untyped versions of $M_1$
and $M_2$ are congruent in any untyped context. Nevertheless, a simple typable context using
only numerals distinguishes their transforms.
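The distinguishing context of Theorem 3.2 can be replayed in any call-by-value language. The Python sketch below transcribes the continuized terms and the context $[\cdot]\,N_0$ from the proof; the numerals $c_1$ and $c_2$ are modelled by the integers 1 and 2, and the sample arguments `add` and `double` are arbitrary choices made for this example only.

```python
# Direct terms of Theorem 3.2, written as curried Python functions.
M1 = lambda x: lambda y: lambda z: (lambda w: x(z)(w))(y(z))
M2 = lambda x: lambda y: lambda z: x(z)(y(z))

# Continuized forms, transcribed from the reduced terms in the proof.
M1_bar = lambda k0: k0(lambda x: lambda k1: k1(lambda y: lambda k2: k2(
    lambda z: lambda k3: y(z)(lambda n: x(z)(lambda m: m(n)(k3))))))
M2_bar = lambda k0: k0(lambda x: lambda k1: k1(lambda y: lambda k2: k2(
    lambda z: lambda k3: x(z)(lambda m: y(z)(lambda n: m(n)(k3))))))

# The context C[.] = [.] N0: it supplies "functions" that ignore their continuation.
N2 = lambda r: r(1)(lambda a: a)
N1 = lambda q: q(lambda a: lambda b: 2)(N2)
N0 = lambda p: p(lambda a: lambda b: 1)(N1)

def context(term):
    """Plug a continuized term into the hole of C[.]."""
    return term(N0)

# Well-behaved direct arguments (assumed for illustration).
add = lambda a: lambda b: a + b
double = lambda a: 2 * a

same_on_direct_args = M1(add)(double)(3) == M2(add)(double)(3)
result_1 = context(M1_bar)   # the answer produced by the y-call wins
result_2 = context(M2_bar)   # the answer produced by the x-call wins
```

The arguments `lambda a: lambda b: 1` and `lambda a: lambda b: 2` discard the continuation `b` they are handed, which no transform of a direct term would do. Whichever of them is called first therefore decides the final answer, exposing the evaluation order that the direct terms hide.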


¹Other techniques exist for verifying congruences: one may rely upon either an adequate or fully abstract
denotational semantics or upon an equational system sound for $\equiv^{v}_{obs}$ yet strong enough to prove the
congruence [12, 20]. Either method rests upon a nontrivial adequacy or soundness proof. Plotkin [18] claims both
methods can be used to prove $M_1 \equiv^{v}_{obs} M_2$, using either pre-domains [22] or Moggi's $\lambda_p$ [16], but I have not
worked through the proofs of adequacy of the pre-domain semantics or soundness of $\lambda_p$ for $\equiv^{v}_{obs}$.

²In fact, a stronger statement is true: $\overline{M_1} \not\sqsubseteq_v \overline{M_2}$ and $\overline{M_2} \not\sqsubseteq_v \overline{M_1}$.
The Adequacy Theorem clarifies why $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$: a context with "illegal" continuations
distinguishes the continuized terms. One could sensibly argue that $\overline{M_1}$ and $\overline{M_2}$ should not
be distinguished, since the distinguishing context will never arise under the intended uses
of $\overline{M_1}$ and $\overline{M_2}$. But granting this, the theorem nevertheless points out a legitimate concern:
what methods shall we use to prove that two terms are congruent with respect to all "legal"
contexts, and what exactly are the legal contexts? This question might arise if we wanted
to justify a post-transform code optimization in which transformed code $\overline{M}$ was replaced
by an "optimized" expression $N$ equivalent to $\overline{M}$ in all legal contexts. Here $N$ itself
need not equal $\overline{N_0}$ for any term $N_0$.

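It is not surprising that Theorem 3.2 has an analog in the call/cc setting. Consider,
for example, the language $\lambda_c$ with the call/cc-like operator $\mathcal{C}$ and the abort operator $\mathcal{A}$
[6, 7, 8]. More precisely, $\lambda_c$ has the same syntax as the untyped version of $\lambda_v$ (i.e., where no
variables are decorated with types, and terms need not be well-typed), with the additional
terms $\mathcal{C}\,M$ and $\mathcal{A}\,M$. The reduction relation for $\lambda_c$, $\to_c$, is defined by the rules

\[
\begin{array}{ll}
(\mathcal{A}\,M)\,N \to_c \mathcal{A}\,M & (\mathcal{C}\,M)\,N \to_c \mathcal{C}\,(\lambda\kappa.M\,(\lambda m.\kappa\,(m\,N))) \\
V\,(\mathcal{A}\,M) \to_c \mathcal{A}\,M & V\,(\mathcal{C}\,M) \to_c \mathcal{C}\,(\lambda\kappa.M\,(\lambda v.\kappa\,(V\,v)))
\end{array}
\]

and the outermost computation rules (which are only applicable in empty contexts)

\[
\mathcal{A}\,M \triangleright_c M \qquad \mathcal{C}\,M \triangleright_c M\,(\lambda x.\mathcal{A}\,x)
\]

in addition to the (untyped versions of) rules of $\to_v$. Let $\twoheadrightarrow_c$ be the reflexive, transitive
closure of $(\to_c \cup \triangleright_c)$, and let $\equiv^{c}_{obs}$ denote the observational congruence relation on terms of
$\lambda_c$ when observing numerals. Then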


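Theorem 3.3 If $M_1$ and $M_2$ are the terms above, $M_1 \not\equiv^{c}_{obs} M_2$.


Proof: Let $C[\cdot]$ be the context $[\cdot]\,(\lambda x.\Omega)\,(\lambda y.\mathcal{C}\,(\lambda z.c_1))\,c_1$. Here, $\Omega$ is any divergent term
(such as $\mu f.f$). This context forces $C[M_2]$ to diverge but makes $C[M_1]$ converge to $c_1$:

\[
\begin{aligned}
C[M_1] &\twoheadrightarrow_c (\lambda w.(\lambda x.\Omega)\,c_1\,w)\,((\lambda y.\mathcal{C}\,(\lambda z.c_1))\,c_1) \\
       &\to_c (\lambda w.(\lambda x.\Omega)\,c_1\,w)\,(\mathcal{C}\,(\lambda z.c_1)) \\
       &\to_c \mathcal{C}\,(\lambda\kappa.(\lambda z.c_1)\,(\lambda v.\kappa\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v))) \\
       &\triangleright_c (\lambda\kappa.(\lambda z.c_1)\,(\lambda v.\kappa\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v)))\,(\lambda x.\mathcal{A}\,x) \\
       &\to_c (\lambda z.c_1)\,(\lambda v.(\lambda x.\mathcal{A}\,x)\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v)) \\
       &\to_c c_1
\end{aligned}
\]

\[
\begin{aligned}
C[M_2] &\twoheadrightarrow_c ((\lambda x.\Omega)\,c_1)\,((\lambda y.\mathcal{C}\,(\lambda z.c_1))\,c_1) \\
       &\to_c \Omega\,((\lambda y.\mathcal{C}\,(\lambda z.c_1))\,c_1) \\
       &\to_c \Omega\,((\lambda y.\mathcal{C}\,(\lambda z.c_1))\,c_1) \\
       &\to_c \cdots
\end{aligned}
\]

Thus, $M_1 \not\equiv^{c}_{obs} M_2$. ∎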
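The difference in evaluation order that this context exploits can be observed in a small Python sketch. Python has no first-class continuations, so the sketch models only the two behaviours the proof relies on. The effect of $\mathcal{C}\,(\lambda z.c_1)$, which discards its continuation and delivers $c_1$ at top level, is modelled by raising an exception, and $\Omega$ is replaced by a sentinel exception so that the example terminates. The names `Abort`, `Diverges`, `omega`, `abort_with_c1`, and `run` are assumptions made for this illustration.

```python
class Abort(Exception):
    """Models C (\\z.c1): the rest of the computation is dropped and a value is returned."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value

class Diverges(Exception):
    """Sentinel standing in for the divergent term Omega (a real Omega would loop forever)."""

def omega(_x):
    # (\x.Omega): applying it starts the divergent computation
    raise Diverges()

def abort_with_c1(_y):
    # (\y.C (\z.c1)): applying it discards the current continuation and yields c1
    raise Abort(1)

M1 = lambda x: lambda y: lambda z: (lambda w: x(z)(w))(y(z))   # evaluates (y z) first
M2 = lambda x: lambda y: lambda z: x(z)(y(z))                   # evaluates (x z) first

def run(term):
    """Evaluate C[term] = term (\\x.Omega) (\\y.C (\\z.c1)) c1 at top level."""
    try:
        return term(omega)(abort_with_c1)(1)
    except Abort as a:
        return a.value
    except Diverges:
        return "diverges"

outcome_1 = run(M1)
outcome_2 = run(M2)
```

In `M1` the operand `y(z)` is the first application evaluated, so the abort fires before `x` is ever called. In `M2` the operator `x(z)` is evaluated before the operand, so the divergent call is reached first.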


The particular terms $M_1$ and $M_2$ can also be used to point out problems with continuation
semantics. If one bases the semantics of $\lambda_v$ on the transform, i.e. the meaning of a
term $M$ is the meaning of $\overline{M}$ in some well-chosen model, two terms may be observationally
congruent but fail to be equivalent in the model. The terms $M_1$ and $M_2$ again provide the
desired example.

Less contrived examples appear in the literature. Meyer and Sieber, for instance, point
out that two ALGOL blocks may be observationally congruent but not congruent if goto
statements are allowed [14]. Since jumps are usually definable in a continuation semantics,
the two blocks will not be semantically equivalent. Reasoning principles based on a
continuation semantics may thus lead one to conclude facts that are not true about the actual
behavior of code.

The failure of familiar reasoning principles seems to be known (albeit informally) in
the community of compiler designers. In the presence of control operators or cps-converted
code, typical compiler optimizations are unsound and procedure calls are often treated as
"black holes." But one need not conclude from the failure of some reasoning principles that
the situation for continuations is a black hole. There are interesting reasoning principles
which hold in continuation settings. For example, consider the $\lambda_v$ terms


\[
\begin{aligned}
P_1 &= \lambda a.\lambda b.(\lambda x.x)\,((\lambda y.y)\,(a\,b)) \\
P_2 &= \lambda a.\lambda b.(\lambda x.x)\,(a\,b)
\end{aligned}
\]

that are not provably equivalent using $=_v$. In $\lambda_v$ these two terms are observationally
congruent, a fact proven by Felleisen [5], who has developed further principles for proving
observational congruences in this setting. A setting involving continuations seems to require
a new theory for reasoning about code. Such a theory remains to be found.
Chapter 4


Conclusion 


Reasoning about the behavior of cps-converted code requires additional assumptions if
converted terms are to behave as their direct versions. Theorem 3.2 makes this formal:
continuations not arising in continuized contexts may distinguish cps-converted terms. Two
possible approaches for a theory of continuations may be based upon this observation.
One approach to a theory of continuations attempts to capture the notion of a "legal"
continuation. An algebraic method along these lines is developed in [15] using retractions.¹


Definition 4.1 (Informal) A retraction pair $(i, j)$ is a pair of functions such that for
any $a$, $j\,(i\,a) = a$.
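A minimal sketch of a retraction pair at base type, in Python, may help fix the idea. The particular functions below (embedding a value as a computation that hands it to a continuation, and projecting back by supplying the identity continuation) are chosen for illustration and are only assumed to resemble the pairs used by Meyer and Wand; the name `ignores_k` is likewise hypothetical.

```python
def i(a):
    """Embed a value as a continuized computation: pass a to whatever continuation arrives."""
    return lambda k: k(a)

def j(m):
    """Project a continuized computation back to a value by running it with the identity."""
    return m(lambda x: x)

# j undoes i, as the definition requires ...
round_trip = j(i(7))

# ... but i does not undo j: a computation that ignores its continuation
# is not recovered by embedding its projected value.
ignores_k = lambda k: 0
original_behaviour = ignores_k(lambda v: v + 1)
rebuilt_behaviour = i(j(ignores_k))(lambda v: v + 1)
```

The asymmetry is the point of the definition: only $j \circ i$ is required to be the identity. Computations such as `ignores_k`, which do not use their continuation in the expected way, are exactly the "illegal" behaviours that lie outside the image of `i`.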


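Meyer and Wand define retraction pairs (at all types) that, when applied to a continuized
term, supply the right continuations at the right time. Specifically, in the simply-typed,
call-by-name $\lambda$-calculus with no constants ($\lambda_n$, with $\beta\eta$ equational reasoning $=_n$), Meyer
and Wand prove


Theorem 4.2 (Meyer, Wand) For any type $\alpha$, there exist $\lambda_n$-definable retraction pairs
$(i_\alpha, j_\alpha)$ and $(I_\alpha, J_\alpha)$, where $i_\alpha : \alpha \to \alpha'$, $j_\alpha : \alpha' \to \alpha$, $I_\alpha : \alpha' \to ((\alpha' \to o) \to o)$, and
$J_\alpha : ((\alpha' \to o) \to o) \to \alpha'$, namely

\[
\begin{aligned}
I_\alpha &= \lambda x^{\alpha'}.\lambda\kappa^{\alpha' \to o}.\kappa\,x \\
J_\alpha &= \begin{cases}
  \lambda x^{(o \to o) \to o}.x\,(\lambda a^{o}.a) & \text{if } \alpha = o \\
  \lambda x^{(\alpha' \to o) \to o}.\lambda b^{\sigma'}.\lambda\kappa^{\tau' \to o}.x\,(\lambda a^{\alpha'}.a\,b\,\kappa) & \text{if } \alpha = \sigma \to \tau
\end{cases} \\
i_\alpha &= \begin{cases}
  \lambda x^{o}.x & \text{if } \alpha = o \\
  \lambda x^{\alpha}.\lambda a^{\sigma'}.I_\tau\,(i_\tau\,(x\,(j_\sigma\,a))) & \text{if } \alpha = \sigma \to \tau
\end{cases} \\
j_\alpha &= \begin{cases}
  \lambda x^{o}.x & \text{if } \alpha = o \\
  \lambda x^{\alpha'}.\lambda a^{\sigma}.j_\tau\,(J_\tau\,(x\,(i_\sigma\,a))) & \text{if } \alpha = \sigma \to \tau
\end{cases}
\end{aligned}
\]

Moreover, $M =_n j_\alpha(J_\alpha\,\overline{M})$ for any closed, pure term $M$.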


By applying the retractions, one can thus recover the meaning of a direct term from its
continuized form.²


¹Inclusive predicates have also been used to establish connections between the direct and continuation
semantics of a language [24, 26, 29]. The inclusive predicate approach seems necessary in cases where the
denotational domains are built recursively.

²Even in the simplified setting of $\lambda_n$, we cannot expect to have $\overline{M} = i(M)$ for any $i$. This follows because
there are two pure, closed terms $M$, $N$ where $M =_n N$ but $\overline{M}$ and $\overline{N}$ $\lambda_n$-convert to distinct normal forms,
namely the terms $M = \lambda a.\lambda b.\lambda c.(\lambda x.a)\,(b\,c)$ and $N = \lambda a.\lambda b.\lambda c.a$. If $\vdash i(M) = \overline{M}$ and $\vdash i(N) = \overline{N}$, then it
would follow that $\vdash \overline{M} = \overline{N}$ which, by Statman's typical ambiguity theorem [27], is equationally inconsistent.
In the case of $\lambda_v$, we similarly cannot have $\overline{M} = i(M)$ for any $i$ by Theorem 3.2.
Theorem 4.2 can be misleading as soon as recursion is added to the language. In the
pure simply-typed calculus, call-by-name and call-by-value convertibility coincide since no
term causes a divergent computation [2]. Because call-by-name equational reasoning is not
sound for the observational congruence theory of $\lambda_v$, the retraction pairs above may not be
appropriate for $\lambda_v$. In fact, the retraction pairs are no longer retractions: one can only show
that $M \sqsubseteq_v J_\alpha(I_\alpha\,M)$ and $M =_v j_\alpha(i_\alpha\,M)$. We conjecture that a similar reformulation of
Theorem 4.2 holds.³


Conjecture 4.3 For any closed term $M$ of type $\alpha$, $M \sqsubseteq_v j_\alpha(J_\alpha\,\overline{M})$.

This conjecture does not hold if we reverse the $\sqsubseteq_v$:

Theorem 4.4 Let $\alpha = (o \to o \to o \to o) \to (o \to o) \to o \to (o \to o)$ and let
\[
S = \lambda x.\lambda y.\lambda z.x\,z\,(y\,z)
\]
be of type $\alpha$. Then $j_\alpha(J_\alpha\,\overline{S}) \not\sqsubseteq_v S$.
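Proof: In the proof of Theorem 3.2, we saw that

\[
\overline{S} =_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3))))).
\]

Using the fact that $(i\,V)$ is $=_v$ to a value, we can find a simpler form for $j(J\,\overline{S})$:

\[
\begin{aligned}
j_\alpha(J_\alpha\,\overline{S}) &=_v j_\alpha(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3))))) \\
 &=_v \lambda a_1.j\,(J\,(\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.(i\,a_1)\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3)))))) \\
 &=_v \lambda a_1.\lambda a_2.j\,(J\,(\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.(i\,a_1)\,z\,(\lambda m.(i\,a_2)\,z\,(\lambda n.m\,n\,\kappa_3))))) \\
 &=_v \lambda a_1.\lambda a_2.\lambda a_3.j\,(J\,(\lambda\kappa_3.(i\,a_1)\,(i\,a_3)\,(\lambda m.(i\,a_2)\,(i\,a_3)\,(\lambda n.m\,n\,\kappa_3)))) \\
 &=_v \lambda a_1.\lambda a_2.\lambda a_3.\lambda a_4. \\
 &\qquad j_o(J_o\,(\lambda\kappa.(i\,a_1)\,(i\,a_3)\,(\lambda m.(i\,a_2)\,(i\,a_3)\,(\lambda n.m\,n\,(\lambda a.a\,(i\,a_4)\,\kappa)))))
\end{aligned}
\]

Thus, in the typable context

\[
C[\cdot] = (\lambda x.c_1)\,([\cdot]\,(\lambda a.\Omega)\,V_1\,V_2)
\]

where $V_1$ and $V_2$ are closed values, $C[S]$ does not halt but $C[j(J\,\overline{S})] \twoheadrightarrow_v c_1$. ∎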


It also remains open whether there is a $\lambda_v$-definable $j$ such that $M \equiv^{v}_{obs} j(\overline{M})$ or even
whether an interpretation of such a $j$ exists in one of the standard semantical models of $\lambda_v$.

Another approach to a theory of continuations involves finding general methods for
proving observational congruences like that of $P_1$ and $P_2$. A theory in this spirit might exploit the
analogy between the three settings of continuation transform, continuation semantics, and
call/cc-like congruence. We conjecture that a precise match may be found among them.


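Conjecture 4.5 For appropriate choice of direct semantics $\mathcal{D}[\![\cdot]\!]$, continuation semantics
$\mathcal{C}[\![\cdot]\!]$, continuation transform $\overline{M}$, and observational congruence relation $\equiv^{c}_{obs}$ using call/cc-like
operators in contexts,

\[
\begin{aligned}
M \equiv^{c}_{obs} N \quad &\text{iff}\quad \mathcal{D}[\![\overline{M}]\!] = \mathcal{D}[\![\overline{N}]\!] \\
&\text{iff}\quad \mathcal{C}[\![M]\!] = \mathcal{C}[\![N]\!] \\
&\text{iff}\quad \overline{M} \equiv^{v}_{obs} \overline{N}.
\end{aligned}
\]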


³The announcement in [13] of this result is withdrawn.




Establishing this conjecture clearly requires finding a suitably matched triple of transform,
continuation semantics, and call/cc-like operators; e.g., we obviously must not try to
match up a call-by-value transform with a call-by-name direct semantics of a language with
call/cc-like operators.

Developing reliable principles for reasoning about continuations is the ultimate goal of
this research, and it is unclear (at this time) which of these two approaches will yield general
principles. Both avenues are being pursued.
Appendix A
Standard Theorems for the Language $\lambda_v$


The appendix is a compendium of some standard facts about the language $\lambda_v$. Similar
results appear in [2, 19, 20] for call-by-name languages; the techniques for proving these
facts carry over largely to the case of $\lambda_v$. The results are stated and proved with little
comment.


A.1 Church-Rosser Theorem
We follow the proof in [2], using a technique due to Tait and Martin-Löf.


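Definition A.1 The relation $\Rightarrow_p$, the parallel reduction relation, is defined inductively
as follows:

\[
\begin{array}{ll}
M \Rightarrow_p M & \mathsf{pred}\;c_0 \Rightarrow_p c_0 \\[1ex]
\mathsf{succ}\;c_l \Rightarrow_p c_{l+1} & \mathsf{pred}\;c_{l+1} \Rightarrow_p c_l \\[1ex]
\dfrac{P \Rightarrow_p P'}{\mathsf{cond}\;c_0\;P\;Q \Rightarrow_p P'} &
\dfrac{Q \Rightarrow_p Q'}{\mathsf{cond}\;c_{l+1}\;P\;Q \Rightarrow_p Q'} \\[2ex]
\dfrac{B \Rightarrow_p B' \quad P \Rightarrow_p P' \quad Q \Rightarrow_p Q'}{\mathsf{cond}\;B\;P\;Q \Rightarrow_p \mathsf{cond}\;B'\;P'\;Q'} &
\dfrac{M \Rightarrow_p M'}{\lambda x.M \Rightarrow_p \lambda x.M'} \\[2ex]
\dfrac{M \Rightarrow_p M' \quad N \Rightarrow_p N'}{M\,N \Rightarrow_p M'\,N'} &
\dfrac{M \Rightarrow_p M' \quad N \Rightarrow_p V}{(\lambda x.M)\,N \Rightarrow_p M'[x := V]} \\[2ex]
\dfrac{M \Rightarrow_p M'}{\mu f.M \Rightarrow_p \mu f.M'} &
\dfrac{M \Rightarrow_p M' \quad \mu f.M \Rightarrow_p N}{\mu f.M \Rightarrow_p M'[f := N]}
\end{array}
\]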


Lemma A.2 If $N \Rightarrow_p N'$ and $v$ is any variable, then $M[v := N] \Rightarrow_p M[v := N']$.


Proof: By structural induction on $M$. There are two cases to consider in the base case:
Case 1: $M = v$; then $M[v := N] = N \Rightarrow_p N' = M[v := N']$.
Case 2: $M = v'$ for $v'$ some constant or variable not equal to $v$. Then
$M[v := N] = v' \Rightarrow_p v' = M[v := N']$.
There are six cases in the induction case:
Case 1: $M = \lambda v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.
Case 2: $M = \mu v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.
Case 3: $M = \lambda y.P$. By induction, $P[v := N] \Rightarrow_p P[v := N']$; thus,
$M[v := N] \Rightarrow_p M[v := N']$.

Case 4: $M = \mu f.P$. Similar to the previous case.

Case 5: $M = \mathsf{cond}\;P_1\;P_2\;P_3$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$. Thus,
$M[v := N] \Rightarrow_p M[v := N']$.

Case 6: $M = (P_1\;P_2)$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$, so
$M[v := N] \Rightarrow_p M[v := N']$.


This completes the proof. ∎


Lemma A.3 Suppose $M \Rightarrow_p M'$ and $N \Rightarrow_p N'$. If $v$ is a $\lambda$-variable and $N$ is a value, then
$M[v := N] \Rightarrow_p M'[v := N']$. If $v$ is a $\mu$-variable, then $M[v := N] \Rightarrow_p M'[v := N']$.


Proof: By induction on the definition of $M \Rightarrow_p M'$. In the base case, there are four cases:

Case 1: $M' = M$. By Lemma A.2, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathsf{succ}\;c_l$ and $M' = c_{l+1}$. Then $M[v := N] = M \Rightarrow_p M' = M'[v := N']$.

Case 3: $M = \mathsf{pred}\;c_0$ and $M' = c_0$. Similar to the previous case.

Case 4: $M = \mathsf{pred}\;c_{l+1}$ and $M' = c_l$. Similar to the previous case.

This completes the base case. In the induction case, there are ten cases:

Case 1: $M = \mathsf{cond}\;c_0\;P_2\;P_3$ and $M' = P_2'$. By induction, $P_2[v := N] \Rightarrow_p P_2'[v := N']$.
Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathsf{cond}\;c_{l+1}\;P_2\;P_3$ and $M' = P_3'$. Similar to the previous case.

Case 3: $M = \mathsf{cond}\;P_1\;P_2\;P_3$ and $M' = \mathsf{cond}\;P_1'\;P_2'\;P_3'$. By induction, we know that
$P_i[v := N] \Rightarrow_p P_i'[v := N']$. Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 4: $M = \lambda x.P$ and $M' = \lambda x.P'$. If $v = x$, then

$M[v := N] = M \Rightarrow_p M' = M'[v := N']$.
If $v \neq x$, then by induction $P[v := N] \Rightarrow_p P'[v := N']$, so $M[v := N] \Rightarrow_p M'[v := N']$.

Case 5: $M = P\,Q$ and $M' = P'\,Q'$. Similar to Case 3.

Case 6: $M = (\lambda v.P)\,Q$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is
a value. By the induction hypothesis, $Q[v := N] \Rightarrow_p Q'[v := N']$. Also, since $v$ is a
$\lambda$-variable, $N$ must be a value, so $Q'[v := N']$ must be a value. We can thus use the rules
of $\Rightarrow_p$:

\[
M[v := N] \Rightarrow_p P'[v := Q'[v := N']] = M'[v := N'].
\]

Case 7: $M = (\lambda x.P)\,Q$ and $M' = P'[x := Q']$, where $v \neq x$, $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and
$Q'$ is a value. By the induction hypothesis, $P[v := N] \Rightarrow_p P'[v := N']$ and similarly
for $Q$. If $v$ is a $\lambda$-variable, then $Q'[v := N']$ is a value since $N$ is a value by hypothesis;
if $v$ is a $\mu$-variable, $Q'[v := N']$ is a value no matter what $N$ is since $Q'$ is a value.
Thus,

\[
M[v := N] \Rightarrow_p P'[v := N'][x := Q'[v := N']] = P'[x := Q'][v := N'] = M'[v := N'].
\]
Case 8: $M = \mu f.P$ and $M' = \mu f.P'$. Similar to Case 4.


Case 9: $M = \mu v.P$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $M \Rightarrow_p Q'$. Note that $v$
cannot be free in $Q'$, since it is not free in $M$. Thus,

\[
M[v := N] = M \Rightarrow_p M' = M'[v := N'].
\]

Case 10: $M = \mu f.P$ and $M' = P'[f := Q']$, where $f \neq v$, $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$. By
induction, $M[v := N] \Rightarrow_p Q'[v := N']$ and $P[v := N] \Rightarrow_p P'[v := N']$. Thus,

\[
M[v := N] \Rightarrow_p P'[v := N'][f := Q'[v := N']] = P'[f := Q'][v := N'] = M'[v := N'].
\]

This completes the proof. ∎


Lemma A.4 The relation $\Rightarrow_p$ is Church-Rosser.


Proof: Suppose $M \Rightarrow_p M_1$ and $M \Rightarrow_p M_2$. To show that there is an $M_3$ with $M_1 \Rightarrow_p M_3$
and $M_2 \Rightarrow_p M_3$, proceed by induction on the proof of $M \Rightarrow_p M_1$. In the base case, there
are four cases:


Case 1: $M_1 = M$. Pick $M_3 = M_2$; this satisfies the conditions.
Case 2: $M = \mathsf{succ}\;c_l$ and $M_1 = c_{l+1}$. Pick $M_3 = c_{l+1}$; since $M_2$ can only be $M$ or $M_1$,
this choice of $M_3$ suffices.


Case 3: $M = \mathsf{pred}\;c_0$ and $M_1 = c_0$. Pick $M_3 = c_0$; as with the previous case, this $M_3$
meets the conditions since $M_2$ can only be $M$ or $M_1$.


Case 4: $M = \mathsf{pred}\;c_{l+1}$ and $M_1 = c_l$. Pick $M_3 = c_l$; again, this choice suffices.
This completes the base case. In the induction case, there are eight cases to consider:


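Case 1: $M = \mathsf{cond}\;c_0\;P_2\;P_3$ and $M_1 = P_2'$. Then $M_2$ is either $P_2''$ or $\mathsf{cond}\;c_0\;P_2''\;P_3''$. By
the induction hypothesis, there is a $P_2'''$ with $P_2' \Rightarrow_p P_2'''$ and $P_2'' \Rightarrow_p P_2'''$. Then picking
$M_3$ to be $P_2'''$ works.


Case 2: $M = \mathsf{cond}\;c_{l+1}\;P_2\;P_3$ and $M_1 = P_3'$. Similar to the previous case.
Case 3: $M = \mathsf{cond}\;P_1\;P_2\;P_3$ and $M_1 = \mathsf{cond}\;P_1'\;P_2'\;P_3'$, where $P_i \Rightarrow_p P_i'$. Then $M_2$
is either $P_2''$, $P_3''$, or $\mathsf{cond}\;P_1''\;P_2''\;P_3''$. By the induction hypothesis, there are $P_i'''$ with
$P_i' \Rightarrow_p P_i'''$ and $P_i'' \Rightarrow_p P_i'''$. Then picking $M_3$ to be either $P_2'''$, $P_3'''$, or $\mathsf{cond}\;P_1'''\;P_2'''\;P_3'''$
(as appropriate) works.
Case 4: $M = \lambda x.P$ and $M_1 = \lambda x.P'$, where $P \Rightarrow_p P'$. Then $M_2$ must also be of
the form $\lambda x.P''$. By induction, pick $P'''$ where $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$. Then
$M_3 = \lambda x.P'''$ will work.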
Case 5: $M = (\lambda x.P)\,Q$ and $M_1 = P'[x := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is a
value. There are two subcases:

Subcase i: $M_2 = (\lambda x.P'')\,Q''$. By induction, there are $P'''$ and $Q'''$ with $P' \Rightarrow_p P'''$
and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must also be a value.
Pick $M_3 = P'''[x := Q''']$; $M_2 \Rightarrow_p M_3$ easily, and $M_1 \Rightarrow_p M_3$ by Lemma A.3.

Subcase ii: $M_2 = P''[x := Q'']$. By induction, there are two terms $P'''$ and $Q'''$
with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must
also be a value. Pick $M_3 = P'''[x := Q''']$; then both $M_2 \Rightarrow_p M_3$ and $M_1 \Rightarrow_p M_3$
by Lemma A.3.

Case 6: $M = P\,Q$ and $M_1 = P'\,Q'$, where $P \Rightarrow_p P'$ and $Q \Rightarrow_p Q'$. There are two
subcases:

Subcase i: $M_2 = P''\,Q''$. By induction, pick $P'''$ with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$;
pick $Q'''$ similarly. Then $M_3 = P'''\,Q'''$ works.

Subcase ii: $P = \lambda x.R$ and $M_2 = R''[x := Q'']$, for $Q''$ a value. Then $P' = \lambda x.R'$.
By induction, pick $R'''$ with $R' \Rightarrow_p R'''$ and $R'' \Rightarrow_p R'''$; pick $Q'''$ similarly. As
above, note that $Q'''$ must be a value. Picking $M_3$ to be $R'''[x := Q''']$ works,
since $M_1 \Rightarrow_p M_3$ easily and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 7: $M = (\mu f.P)$ and $M_1 = \mu f.P'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before; then $M_3 = \mu f.P'''$
works.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ as before, and let $M_3 = P'''[f := Q'']$; $M_1 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$,
and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 8: $M = (\mu f.P)$ and $M_1 = P'[f := Q']$, where $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before and pick $Q'''$ where
$Q' \Rightarrow_p Q'''$ and $M_2 \Rightarrow_p Q'''$. Then $M_3 = P'''[f := Q''']$ works, since $M_1 \Rightarrow_p M_3$
by Lemma A.3 and $M_2 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ and $Q'''$ as before, and let $M_3 = P'''[f := Q''']$; then $M_1 \Rightarrow_p M_3$ and
$M_2 \Rightarrow_p M_3$ by Lemma A.3.

∎


Definition A.5 $M \Rightarrow_v N$ iff $M =_v N$ using no instance of the symmetry axiom.


Lemma A.6 $M \Rightarrow_v N$ iff $M \Rightarrow_p^{+} N$.


Proof: Let $\Rightarrow_v^{1}$ be the relation of doing 0 or 1 $=_v$ steps without using the symmetry axiom.
When treated as sets, the relations satisfy

\[
\Rightarrow_v^{1} \;\subseteq\; \Rightarrow_p \;\subseteq\; \Rightarrow_v.
\]

Since $\Rightarrow_v$ is the transitive closure of $\Rightarrow_v^{1}$, it is also the transitive closure of $\Rightarrow_p$. ∎


Theorem A.7 The relation $\Rightarrow_v$ is Church-Rosser.


Proof: Since $\Rightarrow_p$ is Church-Rosser, its transitive closure $\Rightarrow_p^{+}$ is also [2]. By Lemma A.6,
$\Rightarrow_v$ is Church-Rosser. ∎
The most important consequence of the Church-Rosser theorem is
Theorem 2.2 If $M =_v N$, then $M \equiv^{v}_{obs} N$.


Proof: If $M \Rightarrow_v N$, then $C[M] \Rightarrow_v C[N]$. Thus, if either $C[M]$ or $C[N]$ reduces to $c_l$ under
$\twoheadrightarrow_v$, both of them will by Theorem A.7. The theorem then follows by an easy induction on
the number of occurrences of the symmetry rule. ∎


A.2 Applicative Congruence 


At each step, the relation $\to_v$ reduces only one subterm. We call that subterm the active
subterm [20]. An examination of the operational rules yields the following definition.


Definition A.8 The active subterm of a non-value, closed term $M$ is

• $M$ if $M$ is of the form $(\mathsf{succ}\;c_l)$, $(\mathsf{pred}\;c_l)$, $(\mathsf{cond}\;c_l\;M_0\;M_1)$, $(\mu f.M_0)$, or $((\lambda x.M_0)\,V)$
for $V$ a value; or

• The active subterm in $M'$, where $M'$ is closed and not a value, if $M$ has the form
$(\mathsf{succ}\;M')$, $(\mathsf{pred}\;M')$, $(\mathsf{cond}\;M'\;M_0\;M_1)$, $(M'\;M_0)$, or $((\lambda x.M_0)\,M')$.

This definition matches the informal description of what the active subterm should be:


Lemma A.9 Let $M$ be a closed subterm of a non-value, closed term $C[M]$, where $M$
contains the active subterm of $C[M]$ and $C[\cdot]$ has only one hole. Then if $M \to_v M'$,

\[
C[M] \to_v C[M'].
\]

Proof: An easy structural induction on $C[\cdot]$. ∎
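The notion of active subterm can be made concrete with a one-step evaluator. The Python sketch below covers only a fragment of the language (numerals, `succ`, `pred`, `cond`, abstraction, and application, with recursion omitted); the tuple encoding and the names `is_value`, `subst`, `step`, and `evaluate` are assumptions made for this illustration. Each recursive call of `step` descends exactly as the second clause of Definition A.8 prescribes, and the non-recursive branches contract the active subterm itself.

```python
# Terms: ("num", n) | ("var", x) | ("lam", x, body) | ("app", f, a)
#        | ("succ", t) | ("pred", t) | ("cond", b, p, q)

def is_value(t):
    return t[0] in ("num", "lam")

def subst(t, x, v):
    """Replace variable x by the closed value v in t (v is closed, so no capture can occur)."""
    tag = t[0]
    if tag == "num":
        return t
    if tag == "var":
        return v if t[1] == x else t
    if tag == "lam":
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, v))
    return (tag,) + tuple(subst(s, x, v) for s in t[1:])

def step(t):
    """One call-by-value step on a closed, well-typed, non-value term."""
    tag = t[0]
    if tag in ("succ", "pred"):
        arg = t[1]
        if not is_value(arg):
            return (tag, step(arg))            # active subterm lies inside the argument
        n = arg[1]
        return ("num", n + 1 if tag == "succ" else max(n - 1, 0))   # pred c0 -> c0
    if tag == "cond":
        _, b, p, q = t
        if not is_value(b):
            return ("cond", step(b), p, q)     # active subterm lies inside the test
        return p if b[1] == 0 else q           # cond c0 P Q -> P, cond c(l+1) P Q -> Q
    if tag == "app":
        _, f, a = t
        if not is_value(f):
            return ("app", step(f), a)         # operator first
        if not is_value(a):
            return ("app", f, step(a))         # then operand
        _, x, body = f
        return subst(body, x, a)               # beta-value step
    raise ValueError("no step: value or open term")

def evaluate(t):
    while not is_value(t):
        t = step(t)
    return t

# (\x. succ x) (pred 3): the active subterm is (pred 3), inside the operand.
TERM = ("app", ("lam", "x", ("succ", ("var", "x"))), ("pred", ("num", 3)))
first = step(TERM)
final = evaluate(TERM)
```

The first step rewrites only the operand `(pred 3)` and leaves the surrounding application untouched, which is the behaviour Lemma A.9 describes for a context $C[\cdot]$ whose hole contains the active subterm.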


Lemma A.10 Let $M$ be a closed subterm of a non-value, closed term $C[M]$, where $M$
contains the active subterm of $C[M]$ and $C[\cdot]$ has only one hole. Then if $M \twoheadrightarrow_v M'$,

\[
C[M] \twoheadrightarrow_v C[M'].
\]

Proof: By induction on $n$, where

\[
M = M_0 \to_v M_1 \to_v M_2 \to_v \cdots \to_v M_n = M'.
\]

The base case, where $n = 0$, is trivial, so we proceed to the induction case. By the induction
hypothesis, $C[M_0] \twoheadrightarrow_v C[M_{n-1}]$. A structural induction on $C[\cdot]$ shows that $M_{n-1}$ contains
the active subterm in $C[M_{n-1}]$; thus, by Lemma A.9, $C[M_{n-1}] \to_v C[M_n]$ so the lemma
holds. ∎


Lemma A.11 (Activity) Let $M$ be a closed term of type $\sigma$ and $C[\cdot]$ be a closed context
with holes of type $\sigma$. Then $C[M] \twoheadrightarrow_v c_l$ iff either

1. $C[M'] \twoheadrightarrow_v c_l$ for any $M'$; or
2. $(\lambda x.C[x])\,M \twoheadrightarrow_v c_l$.
Proof: ($\Rightarrow$) If $M$ is a value, condition (2) holds immediately. So suppose $M$ is not a value.
We use a marking technique due to Bard Bloom. Add to the language $\lambda_v$ the term $\#M$ for
any term $M$, and add the reduction rules (and only these rules)

\[
\#V \to_v V \;\;(V \text{ a value}) \qquad\qquad \dfrac{M \to_v N}{\#M \to_v \#N}
\]

to the definition of the $\to_v$ relation. Note that these rules do not change the computational
behavior of $\lambda_v$, i.e., $M \twoheadrightarrow_v c_l$ iff $\mathit{erase}(M) \twoheadrightarrow_v c_l$, where $\mathit{erase}(M)$ is the result of erasing all
marks in $M$.

Proceed by induction on the number $n$ of occurrences of $\#M$ in $C[\#M]$. The base case
($n = 0$) is trivial. In the induction case, suppose that $C[\#M] \twoheadrightarrow_v c_l$. Let $C'$ be the first
term in the reduction sequence whose active subterm is contained in a subterm of the form $\#M$; if there is no such $C'$, then
condition (1) holds. Let $C' = D[\#M]$, where $D[\cdot]$ has one hole. Since $\#M \twoheadrightarrow_v V$ for
some unmarked value $V$, using the version of Lemma A.10 for the marked language we
conclude that $C' \twoheadrightarrow_v D[V]$. Note that there is a context $E[\cdot]$ with $n - 1$ holes such that
$D[V] = E[\#M]$. The context $E[\cdot]$ has the property that

\[
\begin{aligned}
(\lambda x.C[x])\,M &\twoheadrightarrow_v (\lambda x.C[x])\,V \\
                    &\twoheadrightarrow_v C[V] = E[V].
\end{aligned}
\]

By the induction hypothesis, either $E[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.E[x])\,M \twoheadrightarrow_v c_l$. If the
first condition is true, then $(\lambda x.C[x])\,M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$ so $(\lambda x.C[x])\,M \twoheadrightarrow_v c_l$. If the second
condition is true, then $(\lambda x.C[x])\,M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$ since $(\lambda x.E[x])\,M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$.

($\Leftarrow$) Suppose $(\lambda x.C[x])\,(\#M) \twoheadrightarrow_v c_l$. Again, proceed by induction on the number $n$ of
occurrences of $\#M$ in $C[\#M]$. The base case ($n = 0$) is trivial, so consider the induction
case. Examine the reduction sequence for $C[\#M]$, and pick the first $C'$ whose active
subterm is contained in a $\#M$; if there is no such $C'$, then $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ so
$C[M] \twoheadrightarrow_v c_l$. Let $C' = D[\#M]$, where $D[\cdot]$ is a context with one hole and $\#M$ contains the
active subterm in $D[\#M]$. Then

\[
D[\#M] \twoheadrightarrow_v D[V] = E[\#M]
\]

where $V$ is a value with $\#M \twoheadrightarrow_v V$ and $E[\cdot]$ is an unmarked context with $n - 1$ holes. Since
$(\lambda x.E[x])\,(\#M) \twoheadrightarrow_v c_l$,
by the induction hypothesis $E[\#M] \twoheadrightarrow_v c_l$. Since $C[\#M] \twoheadrightarrow_v E[\#M]$, $C[\#M] \twoheadrightarrow_v c_l$. ∎


Lemma A.12 Let $V_0$ and $V_1$ be closed values of the same type. If $V_0\,V' \sqsubseteq_v V_1\,V'$ for any
closed value $V'$, then $V_0 \sqsubseteq_v V_1$.


Proof: Again, we use the marking technique. Suppose $C[\#V_0] \twoheadrightarrow_v c_l$ assuming, without
loss of generality, that $C[\cdot]$ contains no marked terms. We proceed by induction on $n$,
where an active subterm of the form $((\#V_0)\,V')$ ($V'$ any closed value) appears $n$ times in
the reduction.
In the base case, $n = 0$; thus, $C[V_1] \twoheadrightarrow_v c_l$ trivially. In the induction case, pick the
first term $C'$ in the reduction sequence with an active subterm of the form $((\#V_0)\,V')$.
Let $C' = D[(\#V_0)\,V']$, where $D[\cdot]$ has one hole and the hole is active. We know that
$D[(\#V_0)\,V'] \twoheadrightarrow_v c_l$. By hypothesis, $D[V_1\,V'] \twoheadrightarrow_v c_l$. Let $E[\cdot]$ be the context where
$D[V_1\,V'] = E[\#V_0]$ and $E[\cdot]$ has no occurrences of $\#V_0$. Since $E[\#V_0] \twoheadrightarrow_v c_l$ with $(n - 1)$
reductions of the form $((\#V_0)\,V'')$ for some closed value $V''$, by induction we conclude that
$E[V_1] \twoheadrightarrow_v c_l$. The lemma now follows since $C[V_1] \twoheadrightarrow_v E[V_1] \twoheadrightarrow_v c_l$. ∎


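Theorem 2.3 Let $M$ and $N$ be closed terms of the same type. Then $M \sqsubseteq_v N$ iff, for all
vectors $\vec{V}$ of closed values,

\[
M\,\vec{V} \twoheadrightarrow_v V_0 \quad\text{implies}\quad N\,\vec{V} \twoheadrightarrow_v V_0' \text{ and } V_0 = V_0' \text{ if either is a numeral.}
\]

Proof: ($\Rightarrow$) Trivial.

($\Leftarrow$) By induction on types. Consider first the base case, where $M$ and $N$ are of type $o$.
Suppose $C[\cdot]$ is a context in which $C[M] \twoheadrightarrow_v c_l$; then we know by the Activity Lemma that
either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\,M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$ trivially.
In the second case, since $M$ must reduce to some numeral, say $c_k$, it must be the case that
$N \twoheadrightarrow_v c_k$. Thus, $C[N] \twoheadrightarrow_v c_l$, so $M \sqsubseteq_v N$.

In the induction case, again consider any $C[\cdot]$ where $C[M] \twoheadrightarrow_v c_l$. Then by the Activity
Lemma, either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\,M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$
trivially. In the second case, $M \twoheadrightarrow_v V_0$ for some closed value $V_0$. Since for any vector $\vec{V}$ of
closed values,

\[
M\,\vec{V} \twoheadrightarrow_v V_0 \quad\text{implies}\quad N\,\vec{V} \twoheadrightarrow_v V_0',
\]

it follows (using the empty vector) that $N \twoheadrightarrow_v V_1$ for some closed value $V_1$. By hypothesis,
for any closed value $V'$,

\[
(M\,V')\,\vec{V} \twoheadrightarrow_v c_l \quad\text{implies}\quad (N\,V')\,\vec{V} \twoheadrightarrow_v c_l.
\]

By the induction hypothesis, $M\,V' \sqsubseteq_v N\,V'$ for any $V'$. By Lemma A.12, since $M \equiv^{v}_{obs} V_0$
and $N \equiv^{v}_{obs} V_1$, $M \sqsubseteq_v N$. ∎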